Abstract


We report a measurement of the top quark mass using six candidate events
for the process $p\bar{p} \to t\bar{t} + X \to \ell^+ \nu b\, \ell^- \bar{\nu} \bar{b} + X$, observed in the D0 experiment
at the Fermilab $p\bar{p}$ collider. Using maximum likelihood fits to the dynamics of
the decays, we measure a mass for the top quark of $m_t = 168.4 \pm 12.3$ (stat) $\pm$
3.6 (syst) GeV. We combine this result with our previous measurement in
the $t\bar{t} \to \ell$ + jets channel to obtain $m_t = 172.1 \pm 7.1$ GeV as the best value of
the mass of the top quark measured by D0.
I. INTRODUCTION

The mass of the top quark is a free parameter in the standard model of the electroweak 
interactions [1]. It arises from the Yukawa coupling of the top quark to the Higgs field,
which is not constrained by the model. Through radiative corrections, the value of the top 
quark mass affects predictions of the standard model for many processes. For example, the 
prediction for the mass of the W boson varies by approximately 7 MeV$^1$ for every 1 GeV
change in the mass of the top quark [Q . Precise measurements of the masses of the top quark 
and the W boson constrain the mass of the Higgs boson. This dependence can be turned 
around and the top quark mass predicted from measurements of electroweak processes within 
the framework of the standard model. Such an analysis gives ISStn GeV for the top quark 
mass [^. In this sense, a measurement of the top quark mass constitutes a consistency test 
of the standard model prediction. 

The top quark is the only fermion with a mass close to the vacuum expectation value of 
the Higgs field, or equivalently, with a Yukawa coupling close to unity. It is therefore possible
that by studying the properties of the top quark we can learn more about electroweak 
symmetry breaking. 

The Fermilab Tevatron produces top quarks in collisions of protons and antiprotons at
$\sqrt{s} = 1.8$ TeV. The Tevatron provided the first experimental confirmation of the existence
of the top quark. In $p\bar{p}$ collisions top quarks are produced predominantly in $t\bar{t}$ pairs. The
standard model predicts the top quark primarily (> 99%) to decay to $Wb$. The decay modes
of the $W$ boson then define the signatures of $t\bar{t}$ decays. If both $W$ bosons decay leptonically
the signature contains two charged leptons with high $p_T$. We call this the dilepton channel.
Events in which one of the $W$ bosons decays leptonically and the other into jets contain one
high-$p_T$ charged lepton and high-$p_T$ hadron jets. We call this the lepton+jets channel. In
the all-jets channel both $W$ bosons decay into jets.

The D0 collaboration was first to measure the mass of the top quark in the dilepton
channel. In this article we present a more detailed account of this analysis. The
most precise measurements of the top quark mass have been obtained using the lepton+jets
channel. Table I lists previously published measurements of the top quark mass.

The measurement described in this paper is based on an integrated luminosity of approximately
125 pb$^{-1}$, recorded by the D0 detector during the 1992-1996 collider runs. We first
give a brief description of the experimental setup (Sect. II), data reconstruction (Sect. III),
and calibration procedures (Sect. IV). We then describe the selection of the event sample
(Sect. V), the mass analysis of the selected events (Sect. VI), the maximum likelihood
fit to the data (Sect. VII), and the systematic uncertainties associated with the fit (Sect.
VIII). Finally we summarize the results and combine them with the measurement in the
lepton+jets channel (Sect. IX).


$^1$We use natural units with $\hbar = c = 1$.







TABLE I. Published measurements of the top quark mass. The first uncertainty is statistical,
the second systematic.

Experiment    Channel        Mass
D0            lepton+jets    173.3 ±  5.6 ±  5.5 GeV
D0            dilepton       168.4 ± 12.3 ±  3.6 GeV
CDF           lepton+jets    175.9 ±  4.8 ±  4.9 GeV
CDF           dilepton       161   ± 17   ± 10   GeV
CDF [10]      all-jets       186   ± 10   ± 12   GeV


II. DETECTOR 


D0 is a multipurpose detector designed to study $p\bar{p}$ collisions at high energies. The
detector was commissioned at the Fermilab Tevatron during the summer of 1992. A full
description of the detector can be found in Ref. 0. Here, we describe only briefly the
properties of the detector that are relevant for the mass measurement in the dilepton channel.

We specify detector coordinates in a system with its origin defined by the center of the
detector and the $z$-axis defined by the proton beam. The $x$-axis points out of the Tevatron
ring and the $y$-axis up. We use $\phi$ to denote the azimuthal coordinate and $\theta$ for the polar
angle. Rather than $\theta$, we often use the pseudorapidity $\eta = \tanh^{-1}(\cos\theta)$.
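
The pseudorapidity definition above can be illustrated with a short sketch. The function names are ours, chosen for illustration only; the check uses the electron of the first $e\mu$ event listed in Table IV, whose quoted pseudorapidity is 0.41.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Pseudorapidity eta = atanh(cos(theta)) for a polar angle theta in radians.

    Valid for 0 < theta < pi; eta diverges along the beam axis.
    """
    return math.atanh(math.cos(theta))


def pseudorapidity_from_momentum(px: float, py: float, pz: float) -> float:
    """Same quantity from Cartesian momentum components, using cos(theta) = pz / |p|."""
    p = math.sqrt(px * px + py * py + pz * pz)
    return math.atanh(pz / p)


# Electron of the first e-mu event in Table IV (momenta in GeV).
eta_e = pseudorapidity_from_momentum(12.3, -97.8, 41.1)
```

The definition is equivalent to the more common form $\eta = -\ln\tan(\theta/2)$, so a particle emitted at $\theta = \pi/2$ has $\eta = 0$.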

The detector consists of three primary systems: central tracking, calorimeter, and muon
spectrometer. A cutaway view of the detector is shown in Fig. 1.

The nonmagnetic central tracking system consists of four subdetectors that measure the
trajectories of charged particles: a vertex drift chamber, a transition radiation detector, a
central drift chamber, and two forward drift chambers. These chambers also measure ionization
to identify tracks from single charged particles and $e^+e^-$ pairs from photon conversions.
The central tracking system covers the region $|\eta| < 3.2$.

The uranium-liquid argon calorimeter is divided into three parts, the central calorimeter
and the two end calorimeters, and covers the pseudorapidity range $|\eta| < 4.2$. Longitudinally,
the calorimeter is segmented into an electromagnetic (EM) section with fine sampling and
a hadronic section with coarser sampling. The calorimeter is segmented transversely into
quasiprojective towers with $\Delta\eta \times \Delta\phi = 0.1 \times 0.1$. The third layer of the electromagnetic
calorimeter, where EM showers are expected to peak, is segmented twice as finely in each
direction. The hadronic calorimeter modules back up any cracks in the coverage of the EM
calorimeter modules such that there are no projective cracks in the calorimeter, ensuring
good resolution for the measurement of transverse momentum balance.

Since muons from top quark decays predominantly populate the central region, we use
only the central portion of the muon system, which covers $|\eta| < 1.7$. This system consists of
four planes of proportional drift tubes in front of magnetized iron toroids with a magnetic
field of 1.9 T and two groups of three planes of proportional drift tubes behind the toroids.
The magnetic field lines and the wires in the drift tubes are oriented transversely to the beam
direction. The momentum is obtained from the deflection of the muon in the magnetic field
of the toroid.
III. PARTICLE IDENTIFICATION


The particle identification algorithms used for electrons, muons, and jets are the same
as in previously published analyses. We summarize them in the following sections.


A. Electrons 

Electron candidates are first identified by finding isolated clusters of energy in the EM 
calorimeter along with a matching track in the central detector. We accept electron candidates
with $|\eta| < 2.5$. Final identification is based on a likelihood test on the following five
variables:

• The agreement of the shower shape with the expected shape of an electromagnetic 
shower, computed using the full covariance matrix of the energy depositions in the 
cells of the electromagnetic calorimeter. 

• The electromagnetic energy fraction, defined as the ratio of the shower energy found 
in the electromagnetic calorimeter to the total shower energy. 

• A measure of the distance between the track and the cluster centroid. 

• The ionization dE/dx along the track. 

• A variable characterizing the energy deposited in the transition radiation detector. 

To a good approximation, these five variables are independent of each other for electron 
showers. 

Electrons from $W$ boson decay tend to be isolated. Thus, we make the additional cut

$$\frac{E_{tot}(0.4) - E_{EM}(0.2)}{E_{EM}(0.2)} < 0.1, \tag{1}$$

where $E_{tot}(0.4)$ is the energy within $\Delta R < 0.4$ of the cluster centroid and $E_{EM}(0.2)$ is the
energy in the EM calorimeter within $\Delta R < 0.2$. $\Delta R$ is defined as $\sqrt{\Delta\eta^2 + \Delta\phi^2}$.
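
As a minimal sketch of the isolation requirement, the snippet below computes $\Delta R$ and the isolation fraction for a single cluster. It assumes the isolation variable is the energy outside the EM core divided by the core energy, which is how we read the definitions of $E_{tot}(0.4)$ and $E_{EM}(0.2)$; the function names and example energies are hypothetical.

```python
import math


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Distance in the eta-phi plane, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def isolation(e_tot_04: float, e_em_02: float) -> float:
    """Fraction of energy in the cone dR < 0.4 that lies outside the EM core (dR < 0.2).

    Assumed form: (E_tot(0.4) - E_EM(0.2)) / E_EM(0.2).
    """
    return (e_tot_04 - e_em_02) / e_em_02


def is_isolated(e_tot_04: float, e_em_02: float, cut: float = 0.1) -> bool:
    """Apply the isolation cut; 0.1 is the threshold quoted in the text."""
    return isolation(e_tot_04, e_em_02) < cut
```

For example, a cluster with 50 GeV in the EM core and 52 GeV in the wider cone has an isolation of 0.04 and passes, while one with 60 GeV in the wider cone (isolation 0.2) fails.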


B. Muons 

Two types of muon selection are used in this analysis. The first is used to identify
isolated muons from $W \to \mu\nu$ decay. The second type of muon selection is used to tag
$b$-jets by identifying muons consistent with originating from $b \to \mu + X$ decay. We accept
muons with $|\eta| < 1.7$. Besides cuts on the muon track quality, both selections require that
the energy deposited in the calorimeter along a muon track be at least that expected from
a minimum ionizing particle. For isolated muons, such as those from $W$ boson decays, we
require $\Delta R(\mu, j) > 0.5$ for the distance in the $\eta$-$\phi$ plane between the muon and any
jet. For soft muons in jets, such as those from $b \to \mu + X$ decay, we require $p_T > 4$ GeV and
$\Delta R(\mu, j) < 0.5$. The efficiency × acceptance for either muon selection with these cuts is about
64%.
C. Jets


Jets are reconstructed in the calorimeter using a fixed-size cone algorithm. We use a cone
size of $\Delta R = 0.5$. See Ref. [13] for a detailed description of the jet reconstruction algorithm.


D. Missing Transverse Momentum 


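The missing transverse momentum, $\vec{\not\!p}_T$, is the momentum required to balance the measured
momenta in the event ($\sum \vec{p}_T + \vec{\not\!p}_T = 0$). In the calorimeter, we calculate $\vec{\not\!p}_T^{\,cal}$ as

$$\vec{\not\!p}_T^{\,cal} = -\sum_i E_i \sin\theta_i \begin{pmatrix} \cos\phi_i \\ \sin\phi_i \end{pmatrix}, \tag{2}$$

where $i$ runs over all calorimeter cells, $E_i$ is the energy deposited in the $i$th cell, and $\phi_i$ is
the azimuthal and $\theta_i$ the polar angle of the cell. When there are muons present in the
event we refine the calculation

$$\vec{\not\!p}_T = \vec{\not\!p}_T^{\,cal} - \sum_k \vec{p}_T^{\,\mu_k}, \tag{3}$$

where $\vec{p}_T^{\,\mu_k}$ is the transverse momentum of the $k$th muon as measured by the muon system.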
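
The two-step calculation of the missing transverse momentum can be sketched as follows. This is an illustration under simplifying assumptions: each cell is reduced to a tuple (energy, polar angle, azimuth), each muon to its transverse components, and no correction is made for the energy a muon leaves in the calorimeter. The function names are ours.

```python
import math


def calorimeter_missing_pt(cells):
    """Negative vector sum of E * sin(theta) over cells; cells are (E, theta, phi) tuples."""
    px = -sum(e * math.sin(theta) * math.cos(phi) for e, theta, phi in cells)
    py = -sum(e * math.sin(theta) * math.sin(phi) for e, theta, phi in cells)
    return px, py


def missing_pt(cells, muons=()):
    """Calorimeter missing pT minus the transverse momenta (px, py) of the muons."""
    px, py = calorimeter_missing_pt(cells)
    for mu_px, mu_py in muons:
        px -= mu_px
        py -= mu_py
    return px, py


# Toy event: two cells at theta = pi/2, back to back in azimuth (energies in GeV).
toy_cells = [(10.0, math.pi / 2, 0.0), (4.0, math.pi / 2, math.pi)]
cal_px, cal_py = calorimeter_missing_pt(toy_cells)
```

In the toy event the calorimeter imbalance is 6 GeV along $-x$; a muon carrying $(-6, 0)$ GeV would account for all of it and leave no missing transverse momentum.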


IV. ENERGY SCALE CALIBRATION 
A. Electron Energy Scale 

The measurement of the energy $E$ of electromagnetic showers in the calorimeter is calibrated
using $Z \to ee$, $J/\psi \to ee$, and $\pi^0 \to \gamma\gamma$ decays to a precision of 0.08% at $E = M_Z/2$
and to 0.6% at $E = 20$ GeV. The electron energy scale calibration therefore does not
give rise to any significant uncertainty in the top quark mass measurement.


B. Muon Momentum Scale 

The muon momentum scale, calibrated using $J/\psi \to \mu\mu$ and $Z \to \mu\mu$ candidates, has an
uncertainty of 2.5%. Its effect on our measurement of the top quark mass was determined
by varying the muon momentum scale in Monte Carlo samples of $t\bar{t}$ events with $m_t = 170$
GeV. The tests indicate that the relation between muon scale and top quark mass error is
given by

$$\delta m_t = 12\ \mathrm{GeV} \times \frac{\delta p_T}{p_T}. \tag{4}$$

Hence, the 2.5% uncertainty in muon momentum scale leads to a systematic uncertainty
of 0.3 GeV in our measurement of the top quark mass. This uncertainty is completely
negligible compared to the effect of the jet energy scale.
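
The arithmetic behind the 0.3 GeV figure is a one-line linear propagation. The helper below is only a sketch of that step; the function name is ours, and the 12 GeV slope is the value quoted in the text.

```python
def top_mass_shift(relative_scale_shift: float, slope_gev: float = 12.0) -> float:
    """Shift in the measured top quark mass (GeV) for a relative muon momentum scale shift."""
    return slope_gev * relative_scale_shift


# A 2.5% scale uncertainty gives 12 GeV * 0.025 = 0.3 GeV.
muon_scale_systematic = top_mass_shift(0.025)
```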
C. Jet Energy Scale


The jet energy scale is calibrated relative to the electromagnetic energy scale by balancing
the transverse momentum in events with jets and electromagnetic showers. The exercise
is carried out separately and symmetrically for both data and Monte Carlo.

In addition to the corrections in Ref. 0 we apply an $\eta$-dependent correction derived
from a comparison between $\gamma$+jet events in data and Monte Carlo events created using
the HERWIG event generator and a GEANT-based detector simulation. We also
correct jets that contain a muon, indicative of a semileptonic $b$ quark decay, to compensate
on average for the energy carried away by the undetected neutrino. These corrections are
identical to those used and detailed in the mass analysis based on the lepton+jets final
states, with the exception that no attempt is made to account for gluon radiation outside
of the jet cone. Rather, the procedure in the dilepton analysis is to explicitly account for
additional reconstructed jets, as described in Sect. VI C.

We estimate the degree of possible residual discrepancy between the jet energy response
of the detector and the Monte Carlo simulation from the energy balance between electromagnetic
energy clusters and jets from collider data, compared to photon+jets Monte
Carlo, as a function of photon $p_T$. The data constrain the possible mismatch to less than
±(2.5% + 0.5 GeV) in the jet energy. This uncertainty gives rise to a significant systematic
uncertainty in our top quark mass measurement (see Sect. VIII B).


V. EVENT SELECTION 


A. Basic Event Selection Criteria 


The event selection for the dilepton mass analysis is almost identical to that used for the
measurement of the cross section [11]. We require two charged leptons ($e$, $\mu$) and at least
two jets in the events. In addition we cut on global event quantities like $\not\!p_T$ and $H_T$. The
basic kinematic selection criteria are summarized in Table II. The variable $H_T$ is defined as

$$H_T = \begin{cases} \sum_j p_T^{j} + p_T^{e_1} & \text{for the } ee \text{ and } e\mu \text{ channels;} \\ \sum_j p_T^{j} & \text{for the } \mu\mu \text{ channel,} \end{cases} \tag{5}$$

where $e_1$ is the leading electron in $ee$ events. The sum is over all jets with $p_T > 15$ GeV
and $|\eta| < 2.5$. Muons are not included in the sum because their momenta are measured less
precisely. $H_T$ gives good rejection against background processes, which typically have less
jet activity along with the dilepton signature.
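
The definition of $H_T$ translates into a short function. The sketch below assumes jets are given as (transverse momentum, pseudorapidity) pairs and channels are labeled by the hypothetical strings "ee", "emu", and "mumu"; the example uses the jets and electron of the first $e\mu$ event in Table IV.

```python
def h_t(jets, channel, leading_electron_pt=0.0, pt_min=15.0, eta_max=2.5):
    """Scalar sum of jet pT (jets are (pt, eta) pairs, pT in GeV) with pT > 15 GeV, |eta| < 2.5.

    The leading electron pT is added in the ee and e-mu channels; muons never enter.
    """
    total = sum(pt for pt, eta in jets if pt > pt_min and abs(eta) < eta_max)
    if channel in ("ee", "emu"):
        total += leading_electron_pt
    elif channel != "mumu":
        raise ValueError(f"unknown channel: {channel!r}")
    return total


# First e-mu event of Table IV: two jets and an electron with pT = 98.6 GeV.
ht_emu1 = h_t([(27.3, -0.70), (25.1, 1.07)], "emu", leading_electron_pt=98.6)
```

For this event the sum is 151.0 GeV, above the 120 GeV threshold that Table II lists for the $e\mu$ channel.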

The event selection criteria are designed to identify events with two charged leptons and
additional jets in the final state as expected from $t\bar{t} \to \ell\ell + X$ decays. The background in
the $ee$ and $\mu\mu$ channels is dominated by $Z \to ee$ and $Z \to \mu\mu$ decays. We apply additional
criteria, described in the following sections, that remove these particular backgrounds. Table
III gives the number of background events expected in each dilepton channel after all selection
criteria are applied. Instrumental backgrounds arise from particle misidentification,
e.g. mistaking a jet for an electron.
TABLE II. Kinematic and fiducial cuts used in selecting dilepton events.

Objects        Quantity             $ee$          $e\mu$        $\mu\mu$
2 Leptons      $p_T$                > 20 GeV      > 15 GeV      > 15 GeV
               $|\eta|$             < 2.5         < 1.7         < 1.7
≥ 2 Jets       $p_T$                > 20 GeV      > 20 GeV      > 20 GeV
               $|\eta|$             < 2.5         < 2.5         < 2.5
Event          $\not\!p_T$          --            > 10 GeV      --
               $\not\!p_T^{cal}$    > 25 GeV      > 20 GeV      --
               $H_T$                > 120 GeV     > 120 GeV     > 100 GeV


TABLE III. Expected numbers of background events.

Background Source                 $ee$               $e\mu$             $\mu\mu$
$Z \to ee$, $\mu\mu$              0.058 ± 0.012      --                 0.558 ± 0.21
$Z \to \tau\tau \to \ell\ell$     0.078 ± 0.022      0.099 ± 0.076      0.029 ± 0.017
$WW$                              0.083 ± 0.023      0.074 ± 0.018      0.007 ± 0.004
Drell-Yan                         0.054 ± 0.030      0.002 ± 0.003      0.066 ± 0.035
$t\bar{t} \to e$ + jets           0.04               --                 --
Instrumental                      0.197 ± 0.046      0.035 ± 0.13       0.068 ± 0.010
Total Background                  0.51 ± 0.09        0.21 ± 0.16        0.73 ± 0.25


B. $e\mu$ Channel

The $e\mu$ channel is the most powerful dilepton channel with twice the branching ratio
of the $ee$ and $\mu\mu$ channels and without the background from $Z \to ee$ or $Z \to \mu\mu$ decays.
The largest background is $Z \to \tau\tau \to e\mu + X$, which is suppressed by both branching ratio
and kinematics. Instrumental backgrounds arise from $W$ bosons that decay to $\mu\nu$ and are
produced in association with jets, one of which is mistaken for an electron.

We observe three events in this channel.


C. ee Channel 

The primary source of physics background in the $ee$ channel is $Z$ boson production with
associated jets. These events have no neutrinos and can be rejected effectively by cutting
on $\not\!p_T$. We therefore require $\not\!p_T > 40$ GeV if the dielectron invariant mass is within 12 GeV
of the $Z$ boson mass peak. Instrumental backgrounds arise from $W$+jets production or
multijet events in which jets fake the electron signature.

In this channel we extend our event selection criteria to include an additional event that
was not part of the final sample for the measurement of the cross section. This event passes
all selection criteria, except that one of the electron candidates has no matching track. This
cluster is nevertheless consistent with originating from an electron because the trajectory
connecting the vertex with the cluster passes only through the two inner layers of the central
drift chamber (CDC). The inner two layers do indeed have hits, but to reconstruct a track, hits
are required in at least three layers. The lack of a reconstructed track could indicate a higher
probability for this electron to be misidentified. On the other hand, one of the jets contains a muon
that passes all requirements for the muon-tag analyses reported in reference . A muon
tag indicates that the jet probably originates from the fragmentation of a $b$ quark. The
probability of tagging a jet from the fragmentation of a light quark or a gluon is quite
small. The presence of a $b$ jet reduces the likelihood that this event arises from instrumental
background sources and we therefore include it in the event sample for the mass analysis.

We revise the background estimate for the $ee$ channel to include an additional component
due to the inclusion of this event. We compute the number of additional background events
expected if events are admitted that are missing a matched track for one of the two electron
candidates but have a muon tag. In our data we find 11 events with one electron candidate
and three jets, one with muon tag. In these events, there are 22 jets that could fake a second
electron. The probability for any one of these jets to mimic an electron signature without
matched track requirement is $8 \times 10^{-4}$, so that we expect about 0.018 events due to
the extension of the selection cuts. We also have to take into account that we specifically
extended the selection criteria to add this event. The additional background only contributes
to experiments in which at least one event satisfies the extended selection cuts. This is
expected to happen only once every six experiments. The additional background component
is therefore six times 0.018 or 0.11 events. The most significant source of these background
events is $t\bar{t}$ decays to $e$+jets with a muon-tagged jet, in which one jet is misidentified as
an electron.
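
The numbers in this estimate can be reproduced in a few lines. The sketch assumes a per-jet fake probability of $8 \times 10^{-4}$, the value implied by the quoted 0.018 events for 22 candidate jets; the variable names are ours.

```python
# Inputs quoted in the text.
n_candidate_jets = 22          # jets that could fake a second electron
fake_probability = 8e-4        # assumed per-jet probability (exponent inferred from 0.018 / 22)
experiments_per_occurrence = 6 # the extended selection fires about once every six experiments

# Expected background from the extended selection, before and after conditioning on
# the selection having been extended only because an event was observed.
extra_background = n_candidate_jets * fake_probability
conditioned_background = experiments_per_occurrence * extra_background
```

The unrounded product is 0.0176 events, which the text quotes as 0.018. Multiplying the unrounded value by six gives about 0.106, and multiplying the rounded value gives 0.108; both round to the quoted 0.11 events.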

In total, two $ee$ events enter our final sample.


D. $\mu\mu$ Channel

The dimuon channel shares the $Z \to \ell\ell$ background with the dielectron channel. The
less precise measurement of the muon momentum makes separation of the $t\bar{t}$ signal from
this background more difficult. In order to reduce this background, a kinematic fit to the
$Z \to \mu\mu$ hypothesis is applied, and the event is required to have probability less than
1% for this fit. Even after this cut, $Z$ boson production remains the dominant background
source. Instrumental backgrounds arise from heavy quark jets with a high-$p_T$ muon that is
misidentified as an isolated muon.

One event survives all selection criteria.


E. Dilepton Events 

Six events enter our dilepton event sample: three are $e\mu$ events, two are $ee$ events, and
one is a $\mu\mu$ event. Table IV lists the properties of these events.
TABLE IV. Kinematic properties of dilepton events (momenta in GeV) used in the reconstruction
of the top quark mass. All corrections are included.

Event       Object            $p_x$    $p_y$    $p_z$    $p_T$   $\eta$   $\phi$
$e\mu$ #1   $e$                12.3    -97.8     41.1     98.6     0.41     4.84
            $\mu$             -68.3    272.5     95.1    280.0     0.33     1.82
            $\not\!p_T$       100.5   -152.7       --    182.9       --     5.29
            jet               -25.5     -9.9    -20.8     27.3    -0.70     3.51
            jet               -14.4    -20.5     32.3     25.1     1.07     4.10
$e\mu$ #2   $e$               -75.4     -1.1    -30.2     74.5    -0.39     3.16
            $\mu$             -25.2     10.6    -12.8     27.4    -0.45     2.75
            $\not\!p_T$        62.0      5.2       --     62.3       --     0.08
            jet                38.9    -85.6    -16.0     94.0    -0.17     5.14
            jet                14.2     33.1    -11.4     36.0    -0.31     1.17
            jet                -1.6     29.3     11.9     29.4     0.39     1.63
$e\mu$ #3   $e$               -44.7     20.2    140.1     49.1     1.77     2.72
            $\mu$               5.4     17.2     -3.3     18.1    -0.18     1.27
            $\not\!p_T$       -12.5      4.5       --     13.2       --     2.79
            jet                39.6    -29.9     11.3     49.7     0.22     5.64
            jet                19.8    -19.4    -31.0     27.7    -0.97     5.51
$ee$ #1     $e$                 2.7     50.4     17.1     50.5     0.33     1.52
            $e$                -7.4     21.4    -47.6     22.6    -1.49     1.91
            $\not\!p_T$        41.3     -4.0       --     41.5       --     6.19
            jet               -29.2    -36.9    -37.0     47.1    -0.72     4.04
            jet                 3.5    -27.1    -28.9     27.4    -0.92     4.84
$ee$ #2     $e$                52.3     -4.1    -34.4     52.5    -0.62     6.20
            $e$                -8.5    -26.6     27.0     27.9     0.86     4.40
            $\not\!p_T$        42.6    -11.3       --     44.1       --     6.02
            jet*              -92.4    -26.0    -61.6     96.0    -0.60     3.41
            jet               -23.5     25.3    -34.0     34.6    -0.87     2.32
            jet                 0.0     27.7     18.3     27.7     0.62     1.57
$\mu\mu$    $\mu$             -63.9     12.7    -21.4     65.1    -0.32     2.94
            $\mu$             -16.0     31.0      1.9     34.9     0.05     2.05
            $\not\!p_T$        71.2     53.2       --     88.9       --     0.64
            jet                33.8   -103.1   -107.6    108.5    -0.88     5.03
            jet                -9.1     22.7     27.7     24.5     0.97     1.95
            jet                -8.4    -18.6     47.8     20.5     1.58     4.29

* Jet tagged by a soft muon.
VI. RECONSTRUCTION OF THE TOP QUARK MASS
A. Characteristics of Dilepton Events 

The dilepton decay topology does not provide sufficient information to uniquely reconstruct
the $t$ and $\bar{t}$ quarks. In the simplest scenario, the decay $t \to W^+ b$, $\bar{t} \to W^- \bar{b}$, followed
by $W^+ \to \ell^+ \nu$ and $W^- \to \ell^- \bar{\nu}$, produces six particles in the final state: two charged leptons,
which we allow to be either electrons or muons ($ee$, $e\mu$, or $\mu\mu$); two neutrinos; and two $b$
quarks ($b$, $\bar{b}$), as shown in Fig. 2. Given the identities of the particles, this final state is therefore
completely specified by the momenta of these six particles, i.e. 18 numbers. We measure
the momenta of the charged leptons and the jets from the hadronization of the $b$ quarks directly.
In addition, the observed $\not\!p_T$ provides the $x$ and $y$ components of the sum of the
neutrino momenta for a total of 14 measurements. Assuming $m_t > M_W + m_b$ we can impose
three constraints, two on the masses of the decaying $W$ bosons, $m_{\ell^+\nu} = m_{\ell^-\bar{\nu}} = M_W$, and
one on the masses of the top quarks, $m_{\ell^+\nu b} = m_{\ell^-\bar{\nu}\bar{b}}$. This leaves us with 17 equations and 18
unknowns so that a kinematic fit would be underconstrained. We have to develop a different
procedure to obtain an estimate of the top quark mass from the available information. This
is the fundamental difference between the mass determination in the dilepton channel and
that in the lepton+jets channel, which allows a kinematic fit with two constraints.



FIG. 2. Schematic representation of tt production and decay in the dilepton channels. 
We solve this problem by fitting the dynamics of the decays. For each event we
derive a weight function, which is a measure of the probability density for a $t\bar{t}$ pair to decay
to the observed final state, as a function of the top quark mass. We compare these weight
functions to Monte Carlo simulations of $t\bar{t}$ decays for different values of the top quark mass
and use a maximum likelihood fit to extract the mass value that yields the best agreement.


B. Computation of the Weight Function 

Ideally we would like to compute analytically the probability density for a $t\bar{t}$ pair to
decay to the observed final state for any given value of the top quark mass. This probability
density is given by

$$\mathcal{P}(\{o\}|m_t) \propto \int f(x)\, f(\bar{x})\, |\mathcal{M}|^2\, p(\{o\}|\{v\})\, \delta^4\, d^{18}\{v\}\, dx\, d\bar{x}, \tag{6}$$

where $\{o\}$ is the set of 14 measured quantities and $\{v\}$ is the set of 18 parameters that specify
the final state. $\mathcal{M}$ is the matrix element for the process $q\bar{q}$ or $gg \to t\bar{t} + X \to \ell^+ \nu b\, \ell^- \bar{\nu} \bar{b} + X$,
$f(x)$ the parton density for quarks or gluons of momentum fraction $x$ in the proton, and
$f(\bar{x})$ that for antiquarks or gluons of momentum fraction $\bar{x}$ in the antiproton. The detector
resolution function $p(\{o\}|\{v\})$ is the probability density to observe the values $\{o\}$ given
the final state parameters $\{v\}$. The four-dimensional $\delta$-function enforces the four mass
constraints:

$$\delta^4 = \delta(m_{\ell^+\nu} - M_W) \times \delta(m_{\ell^-\bar{\nu}} - M_W) \times \delta(m_{\ell^+\nu b} - m_t) \times \delta(m_{\ell^-\bar{\nu}\bar{b}} - m_t). \tag{7}$$

Here we neglect the finite widths of the $W$ boson and the top quark.

Unfortunately this expression involves a multidimensional integral that has to be evaluated
numerically and is complicated by the need to include initial and final state gluon
radiation. Such higher order effects complicate the reconstruction of the top quark mass
substantially and cannot be neglected. We therefore do not attempt to compute the exact
probability density given in Eq. (6). Rather, we construct simpler weights that retain sensitivity
to the value of the top quark mass but can be evaluated with the available computing
resources. We calibrate the effect of the simplifications by comparing the weight functions
obtained from the collider data to Monte Carlo simulations (Sect. VII).

The calculation of the weight function proceeds in three steps. First we map the observed
charged leptons and jets to the corresponding $t$ and $\bar{t}$ decay products. There are
ambiguities in this step because the fragmentation of the $b$ quarks may result in more than
one reconstructed jet or because a gluon radiated from the initial state may contribute a jet
to the event. We cannot, in general, distinguish between jets originating from gluons and
quarks. Furthermore, we do not measure the sign of the electron charge nor can we distinguish
between jets originating from quarks and antiquarks. Therefore, there is an ambiguity
in pairing the charged leptons and $b$ jets that originate from the same top quark. We repeat
the following two steps for each of the possible assignments and add the resulting weight
functions.

Given the charged lepton and $b$ quark momenta from the decay of the $t$ and $\bar{t}$ quarks
and the sum of the neutrino momentum components, $p_x^{\nu\bar{\nu}}$ and $p_y^{\nu\bar{\nu}}$, we compute a weight as
a function of the top quark mass. We have developed two algorithms to compute the weight
function which emphasize different aspects of top production dynamics. The first algorithm
(matrix-element weighting) is an extension of the weight proposed in Ref. and takes
into account the parton distribution functions for the initial proton and antiproton and the
decay distribution of the $W$ bosons due to the $V-A$ coupling of the charged current. The
second (neutrino weighting) is based on the available phase space for neutrinos from the
decay of the $t\bar{t}$ pair.

TABLE V. Possible assignments of three observed jets ($j_1$, $j_2$, and $j_3$) to the $b$ quarks and
initial state radiation (ISR).

Permutation    $b$-Jets                    ISR
1              $j_1$          $j_2$        $j_3$
2              $j_1$          $j_3$        $j_2$
3              $j_2$          $j_3$        $j_1$
4              $j_1 + j_2$    $j_3$        --
5              $j_2 + j_3$    $j_1$        --
6              $j_1 + j_3$    $j_2$        --

Finally we average the weight function over the experimental resolution. 

In the following, we first discuss the ambiguities in associating the observables with final
state particles. Then we discuss the two algorithms that are used to compute the weight
functions and finally the experimental resolutions.


C. Jet Combinatorics 

In the calorimeter we detect the jets from the fragmentation of the two $b$ quarks. The
fragmentation of a $b$ quark can produce more than one jet because of hard gluon radiation.
This corresponds to final state radiation. Jets can also originate from gluons radiated by
partons in the initial state. We refer to this as initial state radiation. It is not possible to
tell whether a jet originates from the fragmentation of a quark or a gluon, unless a $b$ quark
decays semileptonically to a muon that we subsequently detect. Thus, reconstruction of the
original partons from the observed jets presents some complication.

We consider jets with $p_T > 15$ GeV. If there are only two such jets we assign their
measured momenta to the two $b$ quarks. If there are more than two jets we have a range
of possible assignments. To limit the possibilities, we restrict the procedure to the three
leading jets in $p_T$. We assign two of them to the $b$ quarks and the third jet either to initial
state radiation, in which case we ignore it, or to final state radiation, in which case we add
its momentum to that of one of the two $b$ quarks. There are six possible permutations for
three jets, as listed in Table V.

If there is a jet in the event that is tagged by a soft muon, we only allow permutations
that assign this jet to a $b$ quark. In the collider data sample this is the case for one $ee$ event.
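
The jet assignments described above can be enumerated with a short sketch. The data layout is our own assumption: each assignment is a triple of two jet groups (one per $b$ quark) and an optional ISR jet, and jets are identified by arbitrary labels ordered by decreasing $p_T$.

```python
from itertools import combinations


def jet_assignments(jets, tagged=None):
    """Enumerate assignments of the (up to three) leading jets to the two b quarks.

    Returns a list of (b_group_1, b_group_2, isr_jet) triples. A group with two jets
    means their momenta are summed (final state radiation); isr_jet is None if no jet
    is attributed to initial state radiation. If `tagged` names a muon-tagged jet,
    assignments that treat it as ISR are dropped.
    """
    jets = list(jets)[:3]
    if len(jets) < 2:
        raise ValueError("at least two jets are required")
    if len(jets) == 2:
        return [((jets[0],), (jets[1],), None)]

    assignments = []
    # Third jet attributed to initial state radiation and ignored.
    for a, b in combinations(jets, 2):
        isr = next(j for j in jets if j not in (a, b))
        assignments.append(((a,), (b,), isr))
    # Third jet attributed to final state radiation and merged with one b jet.
    for a, b in combinations(jets, 2):
        lone = next(j for j in jets if j not in (a, b))
        assignments.append(((a, b), (lone,), None))

    if tagged is not None:
        assignments = [p for p in assignments if p[2] != tagged]
    return assignments
```

With three jets this yields the six permutations of Table V (in a different order); if one jet carries a muon tag, the single permutation that would discard it as ISR is removed, leaving five.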
Not all permutations are equally likely to be correct. For each jet considered to be due
to initial state radiation, we assign a weight factor

$$Q_{isr} = \exp\left(-\frac{p_T^{j} \sin\theta_j}{25\ \mathrm{GeV}}\right). \tag{8}$$

Similarly, for every pair of jets that is assigned to a $b$ quark, we define

$$Q_{fsr} = \exp\left(-\frac{m_{jj}}{20\ \mathrm{GeV}}\right), \tag{9}$$

where $m_{jj}$ is the invariant mass of the two jets. These functional forms of the weights were
derived empirically from a study of $t\bar{t}$ decays generated by ISAJET. The factor $Q_{isr}$ favors
assignments in which jets from initial state radiation are close to the beam direction, and
$Q_{fsr}$ favors the merging of jets which are soft or close together. The numerical coefficients
of the exponents are chosen such that the mean reconstructed top quark masses for events
with two-jet and multi-jet final states are the same.
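
A minimal sketch of the two permutation weights follows. It assumes the exponents are $p_T \sin\theta$ of the ISR jet divided by 25 GeV and the dijet invariant mass divided by 20 GeV, which is our reading of Eqs. (8) and (9); four-momenta are plain (E, px, py, pz) tuples in GeV and the function names are hypothetical.

```python
import math


def q_isr(pt: float, theta: float, scale: float = 25.0) -> float:
    """Weight for a jet attributed to initial state radiation (assumed form of Eq. 8)."""
    return math.exp(-pt * math.sin(theta) / scale)


def invariant_mass(p1, p2) -> float:
    """Invariant mass of two four-momenta given as (E, px, py, pz)."""
    e = p1[0] + p2[0]
    px, py, pz = p1[1] + p2[1], p1[2] + p2[2], p1[3] + p2[3]
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def q_fsr(p1, p2, scale: float = 20.0) -> float:
    """Weight for merging two jets into one b quark (assumed form of Eq. 9)."""
    return math.exp(-invariant_mass(p1, p2) / scale)
```

A jet along the beam ($\theta \to 0$) gets $Q_{isr} \to 1$, and two collinear massless jets get $Q_{fsr} = 1$, in line with the statement that forward ISR jets and soft or nearby merged jets are favored.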

After adding the four-momenta of the jets assigned to a $b$ quark, we rescale the momentum
components, keeping the energy fixed, so that the $b$ quark four-momentum has an
invariant mass of 5 GeV to put the outgoing quark momentum on the mass shell.
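
The rescaling step amounts to scaling the three-momentum by a common factor while leaving the energy untouched. The sketch below illustrates this under the assumption that the jet energy exceeds 5 GeV; the function name is ours.

```python
import math


def put_on_mass_shell(e: float, px: float, py: float, pz: float, mass: float = 5.0):
    """Rescale (px, py, pz) at fixed energy so that E^2 - |p|^2 = mass^2 (GeV units)."""
    p = math.sqrt(px * px + py * py + pz * pz)
    if e <= mass or p == 0.0:
        raise ValueError("energy must exceed the target mass and |p| must be nonzero")
    scale = math.sqrt(e * e - mass * mass) / p
    return e, px * scale, py * scale, pz * scale


# Toy jet: E = 50 GeV, |p| = 50 GeV (massless before rescaling).
e_b, px_b, py_b, pz_b = put_on_mass_shell(50.0, 30.0, 0.0, 40.0)
```

Because all three components are multiplied by the same factor, the direction of the jet is preserved and only the magnitude of its momentum changes.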

There are two ways to pair the momenta of the two charged leptons with the two $b$ quark
momenta. Since we cannot determine which $b$ quark originated from the decay of the $t$ quark
and which from the decay of the $\bar{t}$ quark, we consider both pairings with equal probability.


D. Matrix-Element Weighting (MWT) Algorithm 

Assuming that we know the momenta of the charged leptons ($p^{\ell^+}$, $p^{\ell^-}$), the $b$ quarks ($p^{b}$,
$p^{\bar{b}}$), and the sum of the $x$ and $y$ components of the neutrino momenta ($p_x^{\nu\bar{\nu}}$, $p_y^{\nu\bar{\nu}}$) and that
we impose the three constraints mentioned above, we are still one constraint short of being
able to solve for the unknown components of the neutrino momenta. Assuming a fixed value
for the top quark mass $m_t$ supplies the required constraint to solve the problem, except for
a fourfold ambiguity. Not all solutions are equally likely for any given value of $m_t$. We
therefore assign a weight to the $i$th solution:

$$w_i(m_t) = f(x)\, f(\bar{x})\, p(E^{\ell^+ *}|m_t)\, p(E^{\ell^- *}|m_t), \tag{10}$$

where $f(x)$ and $f(\bar{x})$, the parton distribution functions, are evaluated at $Q^2 = m_t^2$, and
$p(E^{\ell *}|m_t)$ is the probability density function for the energy of the charged lepton in the rest
frame of the top quark ($E^{\ell *}$). This probability density is given by

$$p(E^{\ell *}|m_t) = \frac{4\, m_t E^{\ell *} \left(m_t^2 - m_b^2 - 2 m_t E^{\ell *}\right)}{\left(m_t^2 - m_b^2\right)^2 + M_W^2 \left(m_t^2 + m_b^2\right) - 2 M_W^4}. \tag{11}$$

We sum the weights for all solutions and normalize by a factor $A(m_t)$ to obtain the
weight for the event

$$w(m_t) = A(m_t) \sum_{i=1}^{4} w_i(m_t). \tag{12}$$
The factor $A(m_t)$ ensures that the average weight is independent of the top quark mass.
We compute the weight function for $82 < m_t < 278$ GeV in steps of 4 GeV, where the lower
limit is given by the requirement that the top quark decays into a real $W$ boson and a $b$
quark and the upper limit is placed well above the measurement of the top quark mass in the
lepton+jets channel. The normalization factor is computed using a Monte Carlo simulation
so that

$$\sum_{j=1}^{N} w_j(m_t) = N, \tag{13}$$

where the sum is over the events that pass the selection cuts. We parametrize the factor
$A(m_t)$ at different values of $m_t$ (in GeV) as

$$A(m_t) = \left(5.86 - 0.044\, m_t + 0.000084\, m_t^2\right). \tag{14}$$

E. Neutrino Weighting (nWT) Algorithm 


The neutrino weighting algorithm also computes a weight as a function of the top quark 
mass. In contrast to the MWT algorithm it does not solve for the unknown neutrino 
momentum components, but rather samples the neutrino pseudorapidity space and computes 
a weight based on how much of the sampled space is consistent with the observed \not{E}_T. 

For every value of the top quark mass, we sample the rapidities of the neutrino (\eta_\nu) and 
antineutrino (\eta_{\bar\nu}) from the t\bar{t} decay. For each top decay we then know the momenta of the 
charged lepton and the b quark, the assumed neutrino pseudorapidity, and the top quark 
mass, which allows us to solve for the transverse momentum components of the neutrino (p_x^\nu 
and p_y^\nu) with a twofold ambiguity. The two solutions for each of the two top decays combine 
to give four solutions for the event. For the ith solution we compute a weight based on the 
agreement between the observed \not{E}_T and the sum of the calculated neutrino p_T values: 


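w_i(m_t) = \exp\left[-\frac{(\not{E}_x - p_x^\nu - p_x^{\bar\nu})^2}{2\sigma^2}\right] \times \exp\left[-\frac{(\not{E}_y - p_y^\nu - p_y^{\bar\nu})^2}{2\sigma^2}\right], (15)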


where \sigma = 4 GeV is the resolution for each component of \not{E}_T (Sect. VI F). 
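
The weight for a single solution can be sketched in a few lines of Python. This is only an illustration of the Gaussian agreement between the observed missing transverse energy and the summed neutrino transverse momenta; the function name `solution_weight` and the tuple layout of its arguments are our own assumptions.

```python
import math

SIGMA_MET = 4.0  # GeV, resolution per missing-E_T component quoted in the text


def solution_weight(met_x, met_y, nu, nubar, sigma=SIGMA_MET):
    """Weight for one kinematic solution.

    met_x, met_y -- observed missing-E_T components in GeV
    nu, nubar    -- (p_x, p_y) of the neutrino and antineutrino in GeV
    """
    dx = met_x - (nu[0] + nubar[0])
    dy = met_y - (nu[1] + nubar[1])
    return math.exp(-dx * dx / (2 * sigma ** 2)) * math.exp(-dy * dy / (2 * sigma ** 2))
```

A solution whose neutrino momenta reproduce the observed missing E_T exactly receives weight one; a mismatch of one resolution unit in a single component reduces the weight by a factor exp(-1/2).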

Not every value of the neutrino pseudorapidity is equally likely. Figure 3 shows the 
distribution of neutrino rapidities predicted by the HERWIG Monte Carlo program for several 
top quark masses. The distributions can be approximated by Gaussian curves. The width 
\sigma_\eta of the Gaussian varies as a function of the top quark mass. It can be parametrized by 
the second order polynomial 

\sigma_\eta = 5.56 \times 10^{-6} m_t^2 - 2.16 \times 10^{-3} m_t + 1.314, (16)

as shown in Fig. 4. We compute the weights for ten values of each of the neutrino 
rapidities, spaced such that they divide the Gaussian into slices of equal area. 
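
One way to realize the equal-area sampling is sketched below. The text does not say which point represents each slice, so the choice of the slice median (the quantile at the centre of each probability interval) is an assumption, as is the function name `equal_area_rapidities`. The Gaussian width is passed in as a parameter.

```python
from statistics import NormalDist


def equal_area_rapidities(sigma_eta, n_slices=10):
    """Return one rapidity per slice of a zero-mean Gaussian of width sigma_eta.

    The slices have equal area (probability 1 / n_slices each).
    Assumption: each slice is represented by its median.
    """
    gauss = NormalDist(mu=0.0, sigma=sigma_eta)
    return [gauss.inv_cdf((k + 0.5) / n_slices) for k in range(n_slices)]
```

Because every sampled value stands for the same probability, the weights for the sampled rapidities can be added without additional Gaussian factors.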

To obtain the weight for the event we add the weights for all four solutions and all values 
of the neutrino rapidities, 

W_0(m_t) = \sum_{\eta_\nu} \sum_{\eta_{\bar\nu}} \sum_{i=1}^{4} w_i(m_t). (17)
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FIG. 3. Distributions of neutrino pseudorapidity from top quark decay, modeled by herwig, 
for several top quark masses. The smooth curves are fits to Gaussians. 
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FIG. 4. Width of the Gaussian curves fit to the neutrino pseudorapidity distributions as a 
function of top quark mass. The smooth line is the polynomial parametrization used in the analysis. 
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F. Detector Resolution 

The algorithms described in the two previous sections use as input the measured momenta 
of the charged leptons and b jets and the transverse components of the sum of the neutrino 
momenta. To account for finite resolution, we integrate the weights over the ranges of these 
quantities that are consistent with the measurements to smooth out the weight functions. 

To evaluate this integral, we generate a large number of sets of event parameters over 
which we average the weights. These sets of event parameters derive from the observed 
events by adding normally distributed resolution terms to the observed values to populate 
the parameter space consistent with the measured values. The new values \tilde{o} are given in 
terms of the observed value o, the resolution \sigma for the measurement of o, and a normally 
distributed random variable \xi: 

\tilde{o} = o + \sigma\xi. (18)
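
The averaging over resolution fluctuations can be sketched as a simple Monte Carlo integration. The helper names (`fluctuate`, `averaged_weight`), the dictionary layout of the event parameters, and the fixed random seed are illustrative assumptions, not a description of the analysis code.

```python
import random


def fluctuate(observed, sigma, rng):
    """Return observed + sigma * xi, with xi drawn from a unit normal."""
    return observed + sigma * rng.gauss(0.0, 1.0)


def averaged_weight(weight_fn, observed, sigmas, n_variations, seed=0):
    """Average weight_fn over n_variations smeared copies of the event.

    observed, sigmas -- dicts keyed by quantity name (values in GeV)
    weight_fn        -- callable taking a dict of smeared quantities
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_variations):
        smeared = {k: fluctuate(v, sigmas[k], rng) for k, v in observed.items()}
        total += weight_fn(smeared)
    return total / n_variations
```

Quantities that are not fluctuated, such as directions, can be given a resolution of zero.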

We apply such fluctuations to all momentum measurements. Directions are relatively precise 
and are therefore not fluctuated. This also reduces the number of numerical operations. 
The energy resolution for electrons is 

\sigma(E) = 0.15 GeV^{1/2} \sqrt{E}. (19)

The resolution function for the inverse of the muon momentum is approximately Gaussian. 
We therefore fluctuate the inverse of the momentum with the resolution 

\sigma(1/p) = \left[ \left( \frac{0.18 (p - 2 GeV)}{p^2} \right)^2 + \left( \frac{0.003}{GeV} \right)^2 \right]^{1/2}. (20)

The energy resolution for jets receives contributions from several effects. One is the 
intrinsic resolution of the calorimeter. The energy of the jet is measured as the energy in 
a cone of radius \Delta R = 0.5. This energy is not identical to that of the parton. Additional 
energy can be accrued from overlap with other jets and energy can be lost due to gluon 
radiation outside of the cone. These contributions to the resolution depend on the process 
and we therefore use Monte Carlo t\bar{t} events to evaluate the jet energy resolution. 

We compare the reconstructed jet p_T to that of the nearest cluster of hadrons generated 
by the Monte Carlo in a sample of t\bar{t} events with top quark masses ranging from 110 to 
190 GeV. Typically, the distribution in the fractional mismeasurement in p_T exhibits a 
narrow peak due to the intrinsic calorimeter resolution and broad tails due to ambiguity 
in the jet definition. We fit two Gaussian curves with equal means but different widths 
to the distribution, and parametrize the widths of the two Gaussians and their relative 
normalization as functions of p_T and \eta. Figure 5 shows a typical distribution along with the 
fit that we use as a resolution function. Figure 6 shows the rms resolution as a function of 
p_T. 

The Monte Carlo simulation used to determine the jet energy resolution includes neither 
noise due to the intrinsic radioactivity of the uranium nor noise due to multiple interactions. We 
therefore add an additional uncorrelated constant noise term of 5-6 GeV, depending on \eta. 
These values were determined by balancing the p_T vectors in dijet events. 

FIG. 5. Fractional p_T resolution for jets with 50 < p_T < 60 GeV from t\bar{t} decays generated with 
top quark masses between 110 and 190 GeV using the HERWIG program. The superimposed curve 
is the fit using two Gaussian curves. 

FIG. 6. Rms width of fractional jet p_T resolution functions versus jet p_T for three pseudorapidity 
regions. 

Using a sample of random p\bar{p} interactions, we measure the resolution for any component 
of \not{E}_T to be about 4 GeV. Both components of \not{E}_T are fluctuated by this resolution. The 
\not{E}_T vector is also corrected for the fluctuations in the lepton and jet momenta. 

The number of variations performed for each event is limited by the available computing 
power. We average over 100 variations per event for Monte Carlo samples and 5000 variations 
per event for the collider data. 

The weight function for each event is then 

W(m_t) = \frac{1}{N'} \sum_{j=1}^{N'} \sum_{k=1}^{2} \sum_{l=1}^{N''} \Omega_{ISR} \Omega_{FSR} W_0^x(m_t), (21)

where \Omega_{ISR} and \Omega_{FSR} are the parametrized weights for initial and final state radiation defined above. The index j 
runs over the N' resolution fluctuations, k over the two lepton-b jet pairings, l over the N'' 
jet permutations, and x refers to the MWT or νWT algorithms. 

Figure 7 shows W(m_t) for the dilepton events for the MWT analysis and Fig. 8 shows 
the corresponding functions for the νWT analysis. 



FIG. 7. W(m_t) functions for the dilepton events from the MWT analysis. The labels in the 
upper right hand corners identify the events (cf. Table IV). 


G. Monte Carlo Tests 

We now describe tests of the properties of the weight functions to demonstrate their 
sensitivity to the top quark mass and other parameters. 

FIG. 8. W(m_t) functions for the dilepton events from the νWT analysis. The labels in the 
upper right hand corners identify the events (cf. Table IV). 


1. Parton-level Tests 


Parton-level tests are based on the momenta of the partons generated by the Monte 
Carlo simulation. Tests at this level are subject neither to effects from detector resolution 
nor to initial or final state radiation. To restrict the sample to events that are broadly similar 
to those which enter the collider data analysis, the event selection for these tests requires 
two b quarks and two leptons with p_T > 20 GeV and |\eta| < 2.5. 

We examine the average weight function as a function of input top quark mass by normalizing 
the area of the weight function for each event to unity and then summing these 
normalized functions for a collection of Monte Carlo events. A sample of 10,000 events was 
used, about half of which passed the cuts. The results are shown in Fig. 9 for top quark 
masses of 130 and 190 GeV. On average, the weight function is sharply peaked within one 
GeV of the input mass. The tails of the function are asymmetric, with the high-end tail 
extending further than the low-end tail. 

Figure 10 shows the impact of detector resolution, jet combinatorics, and radiation on 
the weight functions for 190 GeV Monte Carlo events. The distribution becomes significantly 
broader when resolution effects and both lepton-b jet pairings are considered, but the peak 
value remains unchanged. Initial state radiation increases the mean value and adds a high-mass 
tail, as expected. Final state radiation has the opposite effect. In total, the effect of 
resolution, combinatorics, and radiation is to broaden the distribution of the weight function 
and move the peak of the distribution away from the input mass. 

FIG. 9. Average parton-level weight W(m_t) for t\bar{t} decays with (a) m_t = 130 GeV and (b) 
m_t = 190 GeV for the νWT algorithm. The vertical lines indicate the input mass values. 

FIG. 10. Average parton-level weight functions for the νWT algorithm, obtained (a) with the 
parton momenta smeared by the detector resolutions, (b) with the two-fold ambiguity in lepton-jet 
pairings included, (c) with ISR but without FSR, and (d) without ISR but with FSR. The vertical 
lines indicate the input mass value of 190 GeV. 
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2. Tests using Full Simulation 


To quantitatively assess the response of the fitting algorithm to events from the D0 
data sample that pass the kinematic selection described earlier, we use fully simulated 
samples of HERWIG t\bar{t} decays. In contrast to the parametrized detector response used in the 
parton-level tests, these samples derive from a detailed detector model implemented using 
the GEANT program. The events are processed with the same reconstruction program and 
filtered using the same kinematic criteria as for the collider data. 

Figures 11, 12, and 13 show the average weight functions for the full simulation of all 
three dilepton channels. Both the kinematic cuts and the additional complexity of the collider 
environment further degrade the resolution from that obtained in parton-level tests. In 
particular, for top quark masses less than 140 GeV, the distributions are distorted significantly 
by the H_T cut. This distortion reduces the precision with which a top mass value in 
this range can be measured. It does not, however, introduce any bias in our top mass determination 
since the effect of the H_T cut is modeled in the probability distribution functions 
used for the mass fits (Sect. VII). 

FIG. 11. Average weight functions for fully simulated t\bar{t} decay events in the eμ channel from 
the MWT analysis (solid line) and the νWT analysis (dashed line). 

The weight distributions become less sharp as the number of muons in the final state 
increases, reflecting the relatively poor measurement of their momenta. This effect is more 
pronounced for the νWT analysis. For this reason, and also because the signal to background 
ratio is significantly higher for the eμ channel than for the ee or μμ channels, it is important 
to treat the three channels separately when extracting the top quark mass. 

FIG. 12. Average weight functions for fully simulated t\bar{t} decay events in the ee channel from the 
MWT analysis (solid line) and the νWT analysis (dashed line). 

FIG. 13. Average weight functions for fully simulated t\bar{t} decay events in the μμ channel from 
the MWT analysis (solid line) and the νWT analysis (dashed line). 
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VII. MASS FITS 


A. General Procedure 


We estimate the top quark mass by comparing weight functions from Monte Carlo t\bar{t} 
samples, generated at different values of the top quark mass, with the weight functions for 
the collider data. We use a maximum likelihood fit to find the value of the top quark mass 
for which the Monte Carlo predictions agree best with the data. 

For each dilepton event, we compute the weights W(m_t) at 50 values of the top quark 
mass between 80 and 280 GeV. To fit these 50 values directly we would need the probability 
density as a function of 50 arguments, which is impractical. We can, however, reduce the 
number of quantities without losing too much information. The individual weight functions 
are much broader than the size of the steps for which the weights are computed. As shown 
in the preceding figures, their rms is 35-40 GeV. Therefore, we integrate the weights over five bins 
40 GeV wide, as shown in Fig. 14. Since we need information only about the shape of the 
weight function, we normalize the area under the function to unity, such that the integrals 
over four of the bins are independent quantities. We thereby reduce the weight function for 
each event to the four-dimensional vector 


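\vec{W} = (W_1, W_2, W_3, W_4), (22)

where 

W_1 = \int_{80 GeV}^{120 GeV} W(m) dm (23)

and W_2, W_3, and W_4 are computed analogously. 

We now maximize the joint likelihood 

L = \frac{1}{\sqrt{2\pi}\sigma_b} \exp\left[-\frac{(n_b - \bar{n}_b)^2}{2\sigma_b^2}\right] \frac{(n_s + n_b)^N e^{-(n_s + n_b)}}{N!} \prod_{i=1}^{N} \frac{n_s f_s(\vec{W}_i | m_t) + n_b f_b(\vec{W}_i)}{n_s + n_b} (24)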


with respect to the parameters n_s (the expected number of signal events), n_b (the expected 
number of background events), and m_t (the top quark mass). The product is taken over all 
events. The first term in the likelihood is a Gaussian constraint that forces the expected 
number of background events to agree with the background estimate \bar{n}_b within its uncertainty 
\sigma_b. The second is a Poisson constraint that forces the expected number of events to be 
consistent with the observed number of dilepton events N. The remaining part is the 
probability density for the vector \vec{W}_i for the collider data for n_s signal and n_b background 
events. Here f_s is the probability density function for signal and f_b for background events. 
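
The three factors described above (Gaussian background constraint, Poisson constraint on the total yield, and per-event mixture density) can be combined into a negative log likelihood as in the sketch below. The function name and argument names are ours, and the densities f_s and f_b are assumed to have been evaluated beforehand for each event.

```python
import math


def neg_log_likelihood(n_s, n_b, n_b_est, sigma_b, f_s_vals, f_b_vals):
    """-ln L for one channel at a fixed top quark mass.

    n_s, n_b  -- expected signal and background yields (fit parameters)
    n_b_est   -- background estimate, with uncertainty sigma_b
    f_s_vals  -- signal density at each observed event's weight vector
    f_b_vals  -- background density at each observed event's weight vector
    """
    n_obs = len(f_s_vals)
    mu = n_s + n_b
    gauss = (-0.5 * ((n_b - n_b_est) / sigma_b) ** 2
             - math.log(math.sqrt(2 * math.pi) * sigma_b))
    poisson = n_obs * math.log(mu) - mu - math.lgamma(n_obs + 1)
    shape = sum(math.log((n_s * fs + n_b * fb) / mu)
                for fs, fb in zip(f_s_vals, f_b_vals))
    return -(gauss + poisson + shape)
```

In a fit, this function would be minimized with respect to n_s and n_b at each mass point, and the per-channel values added to obtain the combined -ln L.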


We maximize L with respect to n_s and n_b at each value of m_t using the MINUIT program 
to eliminate the nuisance parameters n_s and n_b. We are left with L at the discrete values 
of m_t for which we have Monte Carlo samples. Each dilepton channel is treated separately 
in this fit and the final likelihood L is the product of the likelihoods from each channel. We 
fit a polynomial to -ln L, the minimum of which gives the measured value of the top quark 
mass. 

The following sections describe the derivation of the probability density function for \vec{W}, 
the parametrization of the likelihood functions, and the fit results. 

FIG. 14. The weight function for a typical Monte Carlo event, normalized to unity. The vertical 
lines show the five intervals over which the weight function is integrated. 
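
The reduction of a sampled weight function to a four-component vector, as described in Sect. VII A, can be sketched as follows. The function name `reduce_weight_function` is ours, and we assume that each sampled weight stands for an equally wide mass step, so that a bin integral reduces to a sum over the samples falling in that bin.

```python
def reduce_weight_function(masses, weights, edges=(80, 120, 160, 200, 240, 280)):
    """Integrate a sampled weight function over five 40 GeV bins, normalize
    the total area to one, and return the first four bin integrals.

    Assumption: samples are equally spaced, so the common step width
    cancels in the normalization.
    """
    sums = [0.0] * (len(edges) - 1)
    for m, w in zip(masses, weights):
        for b in range(len(sums)):
            if edges[b] <= m < edges[b + 1]:
                sums[b] += w
                break
    total = sum(sums)
    if total <= 0:
        raise ValueError("weight function has no positive area")
    # The fifth bin is fixed by the normalization and therefore dropped.
    return [s / total for s in sums[:-1]]
```

For a flat weight function sampled at 82, 86, ..., 278 GeV, each component is 0.2; a function concentrated between 160 and 200 GeV gives (0, 0, 1, 0).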


B. Probability Density Estimation 


To estimate the continuous functions f_s and f_b from the discrete sample of Monte Carlo 
points available for each value of m_t would require a prohibitively large number of Monte 
Carlo events to populate the four dimensional parameter space. We therefore use a probability 
density estimation (PDE) technique employing continuous kernels. 

Consider that each event in the sample is characterized by a set of d uncorrelated values, 
which are grouped into the d-dimensional vector \vec\zeta. Then the probability density f for any 
\vec\zeta can be estimated based on a sample of N_{MC} Monte Carlo events as 

f(\vec\zeta) = \frac{1}{N_{MC} h^d} \sum_{i=1}^{N_{MC}} K\left(\frac{\vec\zeta - \vec\zeta_i}{h}, C\right), (25)

where C is the covariance matrix for the components of \vec\zeta, h is a free parameter, and K is 
the kernel function. 

Any function which is maximal at zero and asymptotically approaches zero as the absolute 
value of its argument becomes large would be an acceptable choice for K. For simplicity, 
we choose a multidimensional Gaussian. In our application, the result of applying either 
the MWT or νWT technique to an event is the 4-dimensional vector \vec{W}. The elements 
of \vec{W} are highly correlated, and so a linear transformation must be applied to the data to 
remove the correlations before using Eq. 25: 

\vec{W}' = A \vec{W}. (26)


The transformation matrix A is chosen so that the covariance matrix C of the transformed 
variables is diagonal. It can be shown that for two distinct sources of events (signal and 
background in our case), there exists a unique matrix A that makes the covariance matrix 
for one source the identity matrix I and that for the other source a general 
diagonal matrix D. We choose to have C be the identity matrix for background. The 
matrix A is computed only once, using the distribution of Monte Carlo t\bar{t} events generated 
at all top quark masses. After transformation, the kernel function has the form: 

K\left(\frac{\vec{W}' - \vec{W}'_i}{h}, C\right) = \prod_{j=1}^{4} \frac{1}{\sqrt{2\pi c_j}} \exp\left[-\frac{(W'_j - W'_{ij})^2}{2 h^2 c_j}\right], (27)

where the c_j are the diagonal elements of C. 
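
A Gaussian-kernel density estimate of the kind described in this section can be sketched as below. This is a generic illustration that assumes the points have already been decorrelated, so that the covariance matrix is diagonal; the function name `kernel_density` and its argument layout are ours, and the default h = 0.3 is the value quoted in the text.

```python
import math


def kernel_density(point, sample, variances, h=0.3):
    """Gaussian-kernel estimate of the probability density at `point`.

    point     -- d-dimensional coordinates (already decorrelated)
    sample    -- list of d-dimensional Monte Carlo points
    variances -- diagonal elements of the covariance matrix
    h         -- smoothing parameter
    """
    d = len(point)
    norm = 1.0
    for c in variances:
        norm *= math.sqrt(2 * math.pi * c)
    total = 0.0
    for mc in sample:
        arg = sum(((point[j] - mc[j]) / h) ** 2 / variances[j] for j in range(d))
        total += math.exp(-0.5 * arg)
    # The factor h**d keeps the estimate normalized to unit integral.
    return total / (len(sample) * h ** d * norm)
```

Small values of h follow the Monte Carlo points closely and give a noisy estimate, while large values smooth away structure, which is why h has to be tuned.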

One minor extension of this method is needed to properly model the background. As 
described earlier, the backgrounds in the dilepton channel arise from a variety of sources. 
We assign weight factors b_j such that their contribution to the probability density corresponds 
to the relative strengths of the n background sources: 

b_j N_j^{MC} \propto n_j, (28)

where N_j^{MC} is the number of Monte Carlo events and n_j is the number of events expected 
from the jth background source. The estimate for the probability density for an event weight 
vector \vec{W} is then given by: 


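f_s(\vec{W} | m_t) = \frac{1}{N_{MC} h^4} \sum_{i=1}^{N_{MC}} K\left(\frac{\vec{W}' - \vec{W}'_i}{h}, C\right) (29)

for signal and 

f_b(\vec{W}) = \frac{1}{\left(\sum_{j=1}^{n} b_j N_j^{MC}\right) h^4} \sum_{j=1}^{n} b_j \sum_{i=1}^{N_j^{MC}} K\left(\frac{\vec{W}' - \vec{W}'_{ji}}{h}, C\right) (30)

for background. 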

The remaining step is to fix the value of the free parameter h to maximize the expected 
resolution of the measurement. Using the ensemble test method described below, we find 
that values of h in the range 0.1 - 0.4 are preferred, and we choose h = 0.3. 


C. Ensemble Tests 

Ensemble tests are mock experiments in which the dilepton events are simulated using 
a Monte Carlo program with a known top quark mass (m_t^{MC}) and processed in exactly the 
same manner as the collider data. The procedure is as follows: if there are N_j events in the 
jth decay channel, we draw N_j events from the MC samples for this decay channel. We then 
select a random number between 0 and 1 for each event. If the random number is greater 
than n_j/N_j (the expected background fraction in that channel), we take an event from the signal sample. Otherwise we select an event from the 
background sample. If there are multiple sources of background, another random number is 

TABLE VI. Results of ensemble tests using the νWT algorithm showing the effect of different 
parametrizations of the -ln L function. The fits are polynomials of degree m to n points. 

   Fit          m_t^MC = 150 GeV               m_t^MC = 200 GeV
 n    m     Median    Mean     R^68        Median    Mean     R^68
            (GeV)     (GeV)    (GeV)       (GeV)     (GeV)    (GeV)
 5    2     152.2     154.1    13.4        198.1     197.8    18.6
 7    2     151.6     154.0    13.0        198.2     198.1    19.0
 9    2     151.9     154.5    13.6        198.8     199.4    18.9
 9    3     151.6     151.8    13.3        196.0     190.0    19.6
11    3     151.9     152.5    13.8        193.4     196.3    19.3


selected in order to decide the source of background from which to draw the event. We then 
fit the ensemble using the maximum likelihood procedure described above. We repeat this 
procedure for a large number of ensembles (typically 1000). In this manner we can gauge 
the statistical properties of the maximum likelihood estimate of the top quark mass, \hat{m}_t. 
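
The construction of one mock experiment can be sketched as follows. The function name `draw_ensemble`, the use of lists as event pools, and the way the background source is chosen in proportion to its relative strength are illustrative assumptions.

```python
import random


def draw_ensemble(n_events, bkg_fraction, signal_pool, bkg_pools, bkg_strengths, rng):
    """Draw one mock experiment of n_events events for a single channel.

    bkg_fraction  -- expected background fraction in this channel
    signal_pool   -- list of simulated signal events
    bkg_pools     -- one list of simulated events per background source
    bkg_strengths -- relative expected yields of the background sources
    rng           -- a random.Random instance
    """
    events = []
    for _ in range(n_events):
        if rng.random() > bkg_fraction:
            events.append(rng.choice(signal_pool))
        else:
            # A second random choice decides the background source.
            pool = rng.choices(bkg_pools, weights=bkg_strengths, k=1)[0]
            events.append(rng.choice(pool))
    return events
```

Repeating this draw many times and fitting each ensemble yields the distribution of fitted masses from which medians, means, and widths are derived.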

We characterize the width of the (in general not Gaussian) distribution of fit results by 
half the length of the shortest interval in m_t that contains 68.3% of the ensembles, R^{68}. 
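
The width measure defined above can be computed from a list of fitted masses with a sliding window over the sorted values, as in this sketch. The function name `r68` and the rounding of the required number of points up to the next integer are our assumptions.

```python
import math


def r68(values, fraction=0.683):
    """Half the length of the shortest interval containing `fraction` of values."""
    xs = sorted(values)
    n = len(xs)
    k = max(1, math.ceil(fraction * n))  # points the interval must contain
    shortest = min(xs[i + k - 1] - xs[i] for i in range(n - k + 1))
    return 0.5 * shortest
```

For a Gaussian distribution this quantity approaches the standard deviation, while for skewed or long-tailed distributions it is less sensitive to outliers than the rms.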


D. Parametrization of the Likelihood Function 


We fit a polynomial to the values of -ln L computed for different top quark masses. 
The fitted top quark mass is the value of m_t for which the polynomial assumes its minimum 
-ln L_0. The statistical uncertainty \delta m_t due to the finite size of the event sample is given 
by half of the interval in m_t for which -ln L < -ln L_0 + 1/2. 
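
For a quadratic polynomial, the minimum and the half-width of the interval in which -ln L rises by 1/2 follow directly from the fitted coefficients. The sketch below uses a plain least-squares parabola fit; the function names are ours and the implementation is illustrative only.

```python
def fit_parabola(xs, ys):
    """Least-squares fit of y = a*u^2 + b*u + c with u = x - mean(x).

    Returns (a, b, c, x0), where x0 is the mean of xs.
    """
    x0 = sum(xs) / len(xs)
    u = [x - x0 for x in xs]
    s = [sum(ui ** k for ui in u) for k in range(5)]            # sums of u^k
    t = [sum(yi * ui ** k for ui, yi in zip(u, ys)) for k in range(3)]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    # Normal equations, solved with Cramer's rule.
    mat = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(mat)
    coeffs = []
    for col in range(3):
        m = [row[:] for row in mat]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    return a, b, c, x0


def mass_and_uncertainty(masses, nll):
    """Return (fitted mass, statistical uncertainty) from -ln L points."""
    a, b, _, x0 = fit_parabola(masses, nll)
    if a <= 0:
        raise ValueError("fitted parabola has no minimum")
    m_hat = x0 - b / (2 * a)
    # -ln L rises by 1/2 at m_hat +/- delta, so a * delta^2 = 1/2.
    delta = (1.0 / (2.0 * a)) ** 0.5
    return m_hat, delta
```

Applied to nine points of an exactly parabolic -ln L curve, the function returns the position of the minimum and the Gaussian width of the corresponding likelihood.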

We have a choice of what order polynomial, and how many points around L_0, to include 
in the fit. The values of m_t and \delta m_t returned by the fit depend on these choices. We 
therefore perform ensemble tests to select the choice that gives the most accurate values. 
For the fitted top quark mass this means agreement with the input mass used to generate the 
ensembles. For the uncertainty it means agreement with the observed scatter of ensemble 
results. 

We fit quadratic and cubic polynomials to five to eleven points, centered on the point of 
maximum likelihood. Table VI gives the results of ensemble tests using these fitting options. 
The cubic does not improve the accuracy of the fitted mass and we therefore choose to fit 
the -ln L points with a quadratic polynomial. 

The width of the fitted quadratic polynomial increases with the number of points included 
in the fit. We choose the number of points that results in pull distributions of unit widths. 
If \hat{m}_t is an unbiased estimate of m_t^{MC} with a Gaussian resolution of width \delta m_t, then the pull 

s = \frac{\hat{m}_t - m_t^{MC}}{\delta m_t} (31)

is normally distributed around zero with unit width. We fit Gaussians to histograms of the 

TABLE VII. Pull means and widths from ensemble tests of the MWT algorithm. 

m_t^MC     n = 5     n = 7          n = 9
(GeV)      Width     Width     Width     Mean
130        1.16      0.90      0.79      0.65
140        1.01      0.90      0.81      0.38
150        1.12      0.95      0.87      0.13
160        1.34      1.12      1.03      0.12
170        1.26      1.08      0.99      0.11
180        1.24      1.08      0.98      0.00
190        1.12      1.02      1.03     -0.06
200        1.17      1.10      1.06     -0.11
210        1.09      1.04      1.04     -0.09

TABLE VIII. Pull means and widths from ensemble tests of the νWT algorithm. 

m_t^MC     n = 5     n = 7          n = 9
(GeV)      Width     Width     Width     Mean
130        1.22      1.04      1.04      0.58
140        1.09      0.97      0.88      0.40
150        1.03      0.92      0.86      0.16
160        1.18      0.99      0.96      0.17
170        1.17      1.06      0.98      0.08
180        1.27      1.11      1.03      0.03
190        1.16      1.05      0.99     -0.07
200        1.07      1.10      1.02     -0.08
210        1.08      1.01      1.03     -0.08



pulls for all ensembles generated with the same m_t^{MC}. The pull widths are tabulated in 
Table VII for the MWT algorithm and in Table VIII for the νWT algorithm. 

The fits that include only five points underestimate \delta m_t. The nine point fits give pull 
widths closest to unity over the whole range of m_t. Therefore we choose to fit the quadratic 
polynomial to nine points for the final results. The pull distributions for ensemble tests at 
a variety of top quark masses are shown in Fig. 15 for the MWT algorithm and in Fig. 16 
for the νWT algorithm. 

Tables IX and X list the median and mean fitted top quark masses from ensemble tests 
using a quadratic fit to nine points. The differences between \hat{m}_t and m_t^{MC} at masses below 150 
GeV can be traced to the small number of events available to model some of the backgrounds 
(Z → ττ, WW). For these background processes the selection efficiency is so low that a 
significant increase in the number of Monte Carlo events that satisfy the selection criteria 
is not possible due to limited computing resources. When we replace these small samples 

FIG. 15. Pull distributions for the MWT algorithm. The smooth curves are fits to Gaussians. 
[Panel titles: m_t = 140 GeV, m_t = 160 GeV] 

FIG. 16. Pull distributions for the νWT algorithm. The smooth curves are fits to Gaussians. 
[Panel titles: m_t = 140 GeV, m_t = 160 GeV] 

TABLE IX. Median and mean of the fitted top quark masses and 68% confidence intervals 
from ensemble tests of the MWT algorithm. 

m_t^MC     Median    Mean      R^68
(GeV)      (GeV)     (GeV)     (GeV)
130        138.1     138.3     13.6
140        144.6     147.1     12.7
150        151.6     153.4     12.8
160        161.6     163.9     15.8
170        172.2     173.7     16.7
180        180.5     181.0     17.3
190        189.5     190.5     17.8
200        200.3     200.1     19.5
210        210.0     210.9     21.4

TABLE X. Median and mean of the fitted top quark masses and 68% confidence intervals from 
ensemble tests of the νWT algorithm. 

m_t^MC     Median    Mean      R^68
(GeV)      (GeV)     (GeV)     (GeV)
130        138.2     139.8     18.1
140        145.9     147.5     13.9
150        151.9     154.5     13.6
160        161.5     163.5     14.4
170        172.2     173.0     16.2
180        180.5     181.3     18.1
190        188.7     189.6     17.7
200        198.8     199.4     18.9
210        210.1     210.0     20.2

with large samples picked randomly from a smooth distribution these differences vanish. 
For fitted masses above about 150 GeV, these differences become small. We choose not 
to correct the results for this effect. It is included in the uncertainty assigned to the fit 
procedure in Sect. VIII F. Figures 17 and 18 show that for the two algorithms, the peak of 
the \hat{m}_t distribution is consistent with m_t^{MC}. 

FIG. 17. Distribution of \hat{m}_t from ensemble tests of the MWT algorithm. The arrows point to 
the input mass. [Panel titles: m_t = 140 GeV, m_t = 160 GeV] 


E. Results 


Applying the procedure outlined above to the dilepton event sample, we find 

m_t = 168.2 ± 12.4 (stat) GeV (32) 

for the MWT algorithm and 

m_t = 170.0 ± 14.8 (stat) GeV (33) 

for the νWT algorithm. Figures 19 and 20 compare \sum_i W_i for collider data to the fitted 
signal plus background shapes. The insets show the corresponding fits to -ln L. 

In Figures 21(a) and 22(a) we compare the statistical uncertainties for the MWT and 
νWT analyses with the distribution of uncertainties observed in ensemble tests with m_t^{MC} = 170 GeV. 
For the MWT analysis there is a 21% probability to obtain a smaller statistical uncertainty 
than 12.4 GeV and for the νWT analysis there is a 47% probability to obtain a smaller 
statistical uncertainty than 14.8 GeV. The pull distributions indicate that \delta m_t is a good 
estimate of the statistical uncertainty. We verify this by considering the subset of ensembles 

FIG. 18. Distribution of \hat{m}_t from ensemble tests of the νWT algorithm. The arrows point to 
the input mass. [Panel titles: m_t = 140 GeV, m_t = 160 GeV; horizontal axes: m_t (GeV)] 



FIG. 19. Summed event weight function \sum_i W_i for the data sample (points), the fitted signal 
plus background (solid), and the background alone (dashed) for the MWT algorithm. The error 
bars indicate the rms observed for five event samples in ensemble tests. The inset shows the 
corresponding fit to -ln L, drawn as a solid line in the region considered in the fit. 

FIG. 20. Summed event weight function \sum_i W_i for the data sample (points), the fitted signal 
plus background (solid), and the background alone (dashed) for the νWT algorithm. The error 
bars indicate the rms observed for five event samples in ensemble tests. The inset shows the 
corresponding fit to -ln L, drawn as a solid line in the region considered in the fit. 

with \delta m_t consistent with the observed value. Figures 21(b) and 22(b) show the distribution 
of mass estimates \hat{m}_t for the ensembles with \delta m_t between the dashed lines in (a). The widths 
of these distributions are consistent with the observed values of \delta m_t. 

The eμ channel, with the largest number of events and smallest background, should 
dominate the result of the fit, while the μμ channel with only one event and a sizeable 
background should have the least effect. We therefore also fit separately the five events from 
the ee and eμ samples and the three eμ events. Table XI lists the results. This table also 
shows the effect of varying the degree of the polynomial used to fit -ln L and the number 
of points included in the fit. No excursions comparable to the statistical uncertainty of the 
measurement are seen in the results of any of these variations. 


VIII. SYSTEMATIC ERRORS 
A. Estimation of Systematic Uncertainties 

Systematic uncertainties give rise to biases in the result of the analysis no matter how 
many events are analyzed. They are due to differences between the collider data and our 
signal or background models. Variations in the event selection or the fit procedure, which in 
general also result in a change in the final result when applied to a small sample of events, do 
not represent systematic uncertainties. Rather, these are statistical effects and are properly 
accounted for by our use of a maximum likelihood fit to define the statistical uncertainty. 

FIG. 21. (a) Distribution of uncertainties \delta m_t obtained from ensemble tests for the MWT 
algorithm with m_t^{MC} = 170 GeV. The arrow marks the value returned by the fit to the data (12.4 
GeV). (b) Distribution of \hat{m}_t for the ensembles with \delta m_t between the dashed lines in (a). 

FIG. 22. (a) Distribution of uncertainties \delta m_t obtained from ensemble tests of the νWT 
algorithm with m_t^{MC} = 170 GeV. The arrow marks the value returned by the fit to the data (14.8 
GeV). (b) Distribution of \hat{m}_t for the ensembles with \delta m_t between the dashed lines in (a). 

TABLE XI. Results of several variations of the maximum likelihood fit to the data. The fits 
are polynomials of degree m to n points. 

Channels        Fit           Fitted Mass (GeV)
                n     m       MWT        νWT
ee, eμ, μμ      5     2       166±12     169±11
                7     2       168±12     170±13
                9     2       168±12     170±15
                11    3       16711^     171±16
ee, eμ          5     2       166±13     173±12
                7     2       167±12     172±15
                9     2       168±13     173±14
                11    3       leelji     172111
eμ              5     2       173±15     169±14
                7     2       173±13     169±13
                9     2       173±13     170±15
                11    3                  170111



Systematic uncertainties can, in general, be estimated using ensemble tests in which a 
mismatch is introduced between the conditions under which the ensembles are created, and 
the assumptions used in the probability density estimation. In most cases we vary conditions 
in the ensembles and then analyse them with the same probability density functions used 
for the collider data, i.e., assuming the nominal conditions. Any deviation of the fitted mass 
values from the mass used when generating the ensembles indicates a systematic effect. Due 
to the finite number of Monte Carlo events available, these systematic effects can be estimated 
with an uncertainty of about 1 GeV. Table XII summarizes the sources of systematic 
uncertainties and their estimated magnitudes. The estimated uncertainties differ insignificantly 
between the two algorithms so that we use the average of the uncertainties from 
both analyses, weighted by the respective statistical uncertainty in the measured top quark 
mass, as an estimate for both algorithms. The following sections describe the individual 
uncertainties in more detail. 


B. Jet Energy Scale 


To propagate the jet energy scale uncertainty (Sect. IV) to the top mass measurement, 
we generate signal Monte Carlo samples (m_t = 170 GeV) and background samples 
with jet energy responses one standard deviation higher and lower than the nominal response. 
We also scale the energy in the calorimeter that is not included in any jet by the 
same factor as the jets, and the \not{E}_T is recomputed to reflect the scale change. We then create 
Monte Carlo ensembles from the scaled samples and fit them using the probability density 
functions generated with the nominal jet energy response. Table XIII shows the results of 
this mismatch in jet energy scale. Averaging the upward and downward excursions of the 
TABLE XII. Summary of systematic uncertainties for the dilepton mass fits. 

                                 Uncertainty (GeV) 
Source                         MWT     νWT     average 
Jet Energy Scale               2.0     2.9     2.4 
Multiple Interactions          1.4     1.2     1.3 
Background Model               0.9     1.5     1.1 
Signal Generator               2.3     1.1     1.8 
Monte Carlo Sample Size        0.3     0.3     0.3 
Likelihood Fit                 0.9     1.3     1.1 
Total                          3.5     3.9     3.6 

TABLE XIII. Effect of varying the jet energy response in ensemble tests with m_t = 170 GeV. 

                        Median m_t (GeV) 
Jet Scale               MWT       νWT 
+2.5% + 0.5 GeV         172.9     174.0 
Nominal                 172.2     172.2 
-2.5% - 0.5 GeV         168.9     168.3 

median results in a systematic uncertainty of 2.0 GeV for the MWT algorithm and 2.9 GeV 
for the νWT algorithm. 
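
The averaging step can be sketched as follows, using the medians of Table XIII. This is a minimal illustration, assuming that "averaging the excursions" means the mean of the absolute upward and downward shifts relative to the nominal median; the function name is hypothetical.

```python
# Median fitted masses in GeV from Table XIII for shifted jet energy responses.
MEDIANS = {
    "MWT":  {"up": 172.9, "nominal": 172.2, "down": 168.9},
    "nuWT": {"up": 174.0, "nominal": 172.2, "down": 168.3},
}

def averaged_excursion(medians):
    """Mean of the absolute upward and downward shifts from the nominal median."""
    up = abs(medians["up"] - medians["nominal"])
    down = abs(medians["down"] - medians["nominal"])
    return 0.5 * (up + down)

jes_uncertainty = {name: averaged_excursion(m) for name, m in MEDIANS.items()}
print(jes_uncertainty)
```

With these inputs the sketch gives about 2.0 GeV for MWT and 2.85 GeV for νWT, in line with the quoted 2.0 and 2.9 GeV.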


C. Signal Monte Carlo Generator 


The accurate determination of the top quark mass depends on the signal Monte Carlo 
providing a faithful description of tt̄ events. Some features, in particular gluon radiation and 
parton fragmentation, are modeled only approximately by HERWIG, and other reasonable 
approximations exist. In the absence of large samples of tt̄ events, none of them can be 
directly excluded. To test the sensitivity of the result to the Monte Carlo generator, we 
generate ensembles of events with the ISAJET event generator. We simulate the detector 
response using GEANT and analyse the events in the standard way. We then fit the weight 
functions of ensembles of these events with the probability density functions obtained from 
Monte Carlo events generated by the HERWIG program. Tables XIV and XV list the results. 
For a given top quark mass, we take the difference ΔMedian between the medians of the 
results from the ISAJET samples (Tables XIV and XV) and the HERWIG samples (given in the 
corresponding HERWIG ensemble-test tables). We compute the average of the magnitude of these differences over all top quark 
masses, 2.3 GeV for the MWT algorithm and 1.1 GeV for the νWT algorithm, and assign 
these values as the systematic uncertainty in the top quark mass measurement. 
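
The averaging of the generator differences can be reproduced from the ΔMedian columns of Tables XIV and XV. The sketch below is only an arithmetic illustration; `DELTA_MEDIAN` and `mean_abs` are names introduced here.

```python
# Delta-median values in GeV (ISAJET minus HERWIG) from Tables XIV and XV,
# listed for input masses 140, 150, ..., 210 GeV.
DELTA_MEDIAN = {
    "MWT":  [-1.0, -0.6, -1.6, -3.2, -2.5, -3.3, -3.1, -3.3],
    "nuWT": [0.0, 0.7, -1.4, -1.4, -1.4, 0.7, -0.2, -3.3],
}

def mean_abs(values):
    """Average of the magnitudes of the differences."""
    return sum(abs(v) for v in values) / len(values)

generator_uncertainty = {name: mean_abs(v) for name, v in DELTA_MEDIAN.items()}
print(generator_uncertainty)
```

The averages are about 2.33 GeV (MWT) and 1.14 GeV (νWT), matching the quoted 2.3 and 1.1 GeV after rounding.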

In addition, we have performed studies to directly assess the impact of gluon radiation 
by varying the fraction of events with gluon radiation in a HERWIG Monte Carlo sample by 
TABLE XIV. Results of analyzing ensembles of events generated by ISAJET with the MWT 
algorithm. 

m_t      Median    Mean      R68      ΔMedian    ΔMean 
(GeV)    (GeV)     (GeV)     (GeV)    (GeV)      (GeV) 
140      143.6     145.0     14.4     -1.0       -2.1 
150      151.0     151.6     14.3     -0.6       -1.8 
160      160.0     161.4     16.4     -1.6       -2.5 
170      169.0     168.6     17.3     -3.2       -5.1 
180      178.0     178.4     18.0     -2.5       -2.6 
190      186.2     186.9     19.8     -3.3       -3.6 
200      197.2     196.1     20.2     -3.1       -4.0 
210      206.7     206.1     22.1     -3.3       -4.8 

TABLE XV. Results of analyzing ensembles of events generated by ISAJET with the νWT 
algorithm. 

m_t      Median    Mean      R68      ΔMedian    ΔMean 
(GeV)    (GeV)     (GeV)     (GeV)    (GeV)      (GeV) 
140      145.9     147.8     15.6     0.0        0.3 
150      152.6     154.4     15.4     0.7        -0.1 
160      160.1     161.6     15.8     -1.4       -1.9 
170      170.8     171.6     17.6     -1.4       -1.4 
180      179.1     179.5     18.2     -1.4       -1.8 
190      189.4     188.7     18.5     0.7        -0.9 
200      198.6     198.3     19.5     -0.2       -1.1 
210      206.8     205.6     20.3     -3.3       -4.4 

50%. This results in a change of 1.3 GeV in the measured top quark mass, which is 
consistent with the uncertainties quoted above based on HERWIG-ISAJET differences. 

We studied the sensitivity of the results to variations in our choice of parton distribution 
functions. We expect the sensitivity to parton distribution functions to be larger for the 
MWT analysis because it uses them explicitly in the mass reconstruction. Our default 
choice is the CTEQ3M set of parton distribution functions [^]. We also perform ensemble 
tests with weight functions derived using MRSA' parton distribution functions with 
three different values of Λ_QCD. The Monte Carlo events for the ensembles were generated 
with an input mass of 170 GeV and CTEQ3M parton distribution functions in the generation 
and the top mass reconstruction. The results are summarized in Table XVI. The variation 
in the median of the ensemble tests is 20 MeV. We conclude that any sensitivity to parton 
distribution functions is negligible compared to other systematic effects in the generation of 
the Monte Carlo samples. 
TABLE XVI. Results of varying the choice of parton distribution functions (pdf) in the MWT 
analysis. 

pdf                           Median (GeV)    Mean (GeV) 
CTEQ3M                        172.25          173.67 
MRSA' (Λ_QCD = 266 MeV)       172.27          173.66 
MRSA' (Λ_QCD = 344 MeV)       172.27          173.51 
MRSA' (Λ_QCD = 435 MeV)       172.26          173.38 


TABLE XVII. Effect of introducing dummy models for the poorly modeled portion of the 
background. 

                        Median m_t (GeV) 
Background Model        MWT       νWT 
Low Mass                172.9     172.7 
Nominal                 172.2     172.2 
High Mass               172.0     171.2 


D. Background Shape 


The modeling of the background also depends on a Monte Carlo simulation. In addition, 
for some sources of background (Z → ℓℓ, WW) very few Monte Carlo events satisfy the 
selection criteria. To estimate how sensitive the result is to the poorly constrained distribution 
of these events, we use dummy models instead of the Monte Carlo samples. These 
models assume that the W(m_t) distributions for these backgrounds are Gaussian, with a 
width chosen randomly between 20 and 60 GeV. In one of the models (“low mass”), the 
mean of the Gaussian was randomly selected between 120 and 160 GeV, and in the other 
(“high mass”) between 180 and 220 GeV. We then perform ensemble tests using the known 
background components plus the dummies to estimate the background probability densities, 
with events drawn from the standard signal and background models. The results are listed 
in Table XVII. Based on the observed shifts in the median m_t, the uncertainties are 0.9 GeV 
and 1.5 GeV for the MWT and νWT analyses, respectively. 
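
The dummy models described above can be sketched as follows. The sketch assumes that "chosen randomly" means uniform sampling within the stated ranges, which the text does not specify, and all function names are hypothetical. The last lines compare the low-mass and high-mass medians of Table XVII.

```python
import math
import random

# Ranges in GeV for the mean of each dummy model, as given in the text.
MEAN_RANGE = {"low mass": (120.0, 160.0), "high mass": (180.0, 220.0)}
WIDTH_RANGE = (20.0, 60.0)

def draw_dummy_model(kind, rng):
    """Draw (mean, width) in GeV for a Gaussian dummy model (uniform sampling assumed)."""
    lo, hi = MEAN_RANGE[kind]
    return rng.uniform(lo, hi), rng.uniform(*WIDTH_RANGE)

def gaussian_weight(m_t, mean, width):
    """Normalized Gaussian standing in for the background W(m_t) shape."""
    z = (m_t - mean) / width
    return math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))

rng = random.Random(12345)  # fixed seed so the sketch is reproducible
low = draw_dummy_model("low mass", rng)
high = draw_dummy_model("high mass", rng)

# Medians in GeV for the (low mass, high mass) models from Table XVII.
MEDIANS = {"MWT": (172.9, 172.0), "nuWT": (172.7, 171.2)}
spread = {name: abs(a - b) for name, (a, b) in MEDIANS.items()}
print(low, high, spread)
```

The quoted uncertainties of 0.9 and 1.5 GeV coincide with the difference between the low-mass and high-mass medians for each algorithm.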


E. Multiple Interactions 

The beams in the Tevatron are structured into six proton and six antiproton bunches. 
Proton and antiproton bunches collide every 3.5 μs in the center of the detector. More than 
one pp̄ interaction can take place during a crossing and the detector sees the superposition 
of all these interactions. At the mean luminosity at which the data were taken (7.5 × 
10^30 /cm^2/s), on average 1.3 interactions occur per crossing. Since the cross section for the 
production of high-p_T secondaries is small, it is very unlikely that more than one of these 
interactions produces high-p_T particles or jets. However, the Monte Carlo models do not 
include the effect of the additional low-p_T particles due to multiple interactions during the 
same crossing. 
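
The mean number of interactions per crossing follows from the luminosity, the time between crossings, and an effective inelastic cross section. The sketch below is a rough consistency check only: the cross section of 50 mb is an assumed round number and is not taken from the text.

```python
LUMINOSITY = 7.5e30          # cm^-2 s^-1, mean luminosity of the data sample
CROSSING_INTERVAL = 3.5e-6   # s, i.e. 3.5 microseconds between bunch crossings
SIGMA_EFFECTIVE = 50e-27     # cm^2 (50 mb): ASSUMED value, not from the text

def mean_interactions(luminosity, sigma, interval):
    """Average number of interactions per crossing: luminosity x cross section x time."""
    return luminosity * sigma * interval

mu = mean_interactions(LUMINOSITY, SIGMA_EFFECTIVE, CROSSING_INTERVAL)
print(mu)
```

With this assumed cross section the result is close to the 1.3 interactions per crossing quoted above.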

There are two ways in which these additional interactions may affect the reconstructed 
event. First, the additional particles deposit energy in the calorimeter, some of which falls 
into the jet cones. Second, the additional tracks may confuse the algorithm that determines 
the z-position of the interaction vertex, leading to mismeasurement of the jet directions. The 
jet energy scale calibration accounts for the former effect on average. To study the latter 
effect, we add particles from one or two simulated additional pp̄ interactions to a sample 
of 5000 Monte Carlo tt̄ decays with m_t = 170 GeV. The signatures of the resulting events 
in the detector are simulated by the GEANT program. The events are reconstructed by the 
same programs as the collider data. For this study ensemble tests are of little help, since the 
small sample sizes prohibit the generation of a large number of independent ensembles. We 
estimate the size of the systematic effect by comparing the W(m_t) distributions in the samples 
with zero, one, and two additional interactions. Although the resolution of the vertex 
degrades with the additional interaction, the effect on the W(m_t) distribution is modest. 
The difference in mean between a sample without additional interactions and the sample in 
which 33% of the events have one and 36% two additional interactions, approximating the 
conditions at which the data were taken, is only 0.6 GeV for the νWT analysis. A change 
of this magnitude is roughly equivalent to a change of 1.2 GeV in the top quark mass. For 
the MWT analysis we get a similar value, 1.4 GeV. 


F. Likelihood Fit and Monte Carlo Statistics 

There are systematic uncertainties in the value of the top quark mass that minimizes 
−ln L. These arise both from the finite number of Monte Carlo events used in determining 
the −ln L points and from the choice of function to fit these points. 

To estimate the effect of the Monte Carlo sample size, we split the signal Monte Carlo 
samples into five subsets and repeat the fit to the data using each subset as the signal model. 
The rms variation observed in the central value is then divided by √5, yielding a systematic 
uncertainty of 0.3 GeV for either algorithm. 
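
The subset procedure can be sketched as below. The fitted masses are hypothetical placeholders, not values from the analysis. The sketch takes the rms about the mean of the subset results (dividing by the number of subsets) and scales it by one over the square root of the number of subsets, which is our reading of the procedure.

```python
import math

def sample_size_uncertainty(fitted_masses):
    """rms spread of the subset fits, scaled by 1/sqrt(number of subsets)."""
    n = len(fitted_masses)
    mean = sum(fitted_masses) / n
    rms = math.sqrt(sum((m - mean) ** 2 for m in fitted_masses) / n)
    return rms / math.sqrt(n)

# HYPOTHETICAL fit results in GeV, one per signal Monte Carlo subset.
subset_fits = [168.0, 169.1, 167.6, 168.9, 168.4]
mc_stat_uncertainty = sample_size_uncertainty(subset_fits)
print(mc_stat_uncertainty)
```

The scaling reflects that each subset contains one fifth of the Monte Carlo events, so the full sample is expected to fluctuate less than any single subset.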

To estimate the uncertainty arising from the choice of the parabolic fit to nine likelihood 
points, we fit Monte Carlo ensembles with m_t = 170 GeV using a variety of parametrizations 
and observe the resulting changes in the median of the fitted m_t. We fit quadratic polynomials to five 
and seven points and cubic polynomials to nine and eleven points. The largest variations of 
0.9 GeV (MWT) and 1.3 GeV (νWT) give estimates of the systematic uncertainties. 
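
A parabolic fit to a set of −ln L points can be sketched in a few lines. The sketch below is not the analysis code: the likelihood points are hypothetical, the least-squares solver is a generic one, and the uncertainty is read off with the conventional rule that −ln L rises by 0.5 above its minimum.

```python
import math

def fit_parabola(xs, ys):
    """Least-squares fit of y = a*u**2 + b*u + c with u = x - x0; returns (a, b, c, x0)."""
    n = len(xs)
    x0 = sum(xs) / n                      # center x to keep the system well conditioned
    us = [x - x0 for x in xs]
    s = [sum(u ** k for u in us) for k in range(5)]               # sums of u^0 .. u^4
    t = [sum(y * u ** k for u, y in zip(us, ys)) for k in range(3)]
    # Normal equations for the unknowns (c, b, a), as an augmented matrix.
    rows = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    for i in range(3):                    # Gauss-Jordan elimination
        pivot = rows[i][i]
        rows[i] = [v / pivot for v in rows[i]]
        for j in range(3):
            if j != i:
                factor = rows[j][i]
                rows[j] = [vj - factor * vi for vj, vi in zip(rows[j], rows[i])]
    c, b, a = rows[0][3], rows[1][3], rows[2][3]
    return a, b, c, x0

def minimum_and_width(masses, neg_log_l):
    """Position of the parabola minimum and the half-width where -ln L rises by 0.5."""
    a, b, _, x0 = fit_parabola(masses, neg_log_l)
    return x0 - b / (2.0 * a), math.sqrt(0.5 / a)

# HYPOTHETICAL -ln L points at nine masses, built from a parabola of width 12 GeV.
masses = [130.0 + 10.0 * i for i in range(9)]
neg_log_l = [20.0 + (m - 170.0) ** 2 / (2.0 * 12.0 ** 2) for m in masses]
m_hat, delta = minimum_and_width(masses, neg_log_l)
print(m_hat, delta)
```

Repeating such a fit with different numbers of points or a higher polynomial order, as described above, shows how much the fitted minimum depends on the parametrization.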


IX. RESULTS 

A. Combination of the MWT and νWT Measurements 

The two algorithms we use give consistent results. The weights computed by the MWT 
and νWT algorithms are based on different aspects of tt̄ production and decay and are 
therefore not completely correlated. To gauge the degree of correlation, we fit ensembles of 
tt̄ Monte Carlo events for a top quark mass of 170 GeV using both algorithms. We then select 
the subset of these ensembles with likelihood functions of similar widths as observed in the 
data (i.e., those for which the MWT analysis yields 11.4 < δm_t < 13.4 GeV and the νWT 
analysis yields 13.8 < δm_t < 15.8 GeV). Based on these tests we find that the correlation 
coefficient between the MWT and νWT algorithms is 0.77. A statistical combination of the 
results from the two algorithms then yields 

m_t = 168.4 ± 12.3 (stat) ± 3.6 (syst) GeV.    (34) 

The systematic uncertainties are taken as completely correlated between the two algorithms. 
Since they differ insignificantly between the two algorithms, we quote the mean from Table XII. 
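
One standard way to combine two correlated measurements is the minimum-variance linear combination sketched below. The text does not spell out the procedure used, so this is an illustration under stated assumptions: the statistical uncertainties are taken as the centers of the width windows quoted above (12.4 and 14.8 GeV), and the central values are placeholders rather than the measured masses.

```python
import math

def combine_correlated(x1, s1, x2, s2, rho):
    """Minimum-variance linear combination of two measurements with correlation rho."""
    cov = rho * s1 * s2
    denom = s1 * s1 + s2 * s2 - 2.0 * cov
    w1 = (s2 * s2 - cov) / denom          # weight of the first measurement
    w2 = 1.0 - w1                         # weights sum to one
    value = w1 * x1 + w2 * x2
    sigma = math.sqrt(s1 * s1 * s2 * s2 * (1.0 - rho * rho) / denom)
    return value, sigma, (w1, w2)

# ASSUMED inputs: widths are the window centers from the text, rho as quoted.
S_MWT, S_NUWT, RHO = 12.4, 14.8, 0.77
# Central values of 170.0 GeV are PLACEHOLDERS, not the measured masses.
value, sigma, weights = combine_correlated(170.0, S_MWT, 170.0, S_NUWT, RHO)
print(value, sigma, weights)
```

With these assumed inputs the combined statistical uncertainty is about 12.3 GeV, consistent with Eq. (34). Because of the large correlation, the combination improves only slightly on the more precise single measurement.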


B. Combination of the Dilepton and Lepton+Jets Measurements 

The value of the top quark mass obtained from the dilepton channel is in good agreement 
with that found by fitting tt̄ → ℓ+jets events @], supporting the hypothesis that both are due 
to the decays of the same pair-produced particles. We obtain our best measurement of the 
mass of the top quark by combining the results of the analyses in the two channels. Since 
the two measurements are statistically independent, the combination is straightforward. 
The systematic uncertainties in the combined measurement are evaluated by propagating 
the uncertainties in each channel with correlation coefficients of either 0 (for MC statistics, 
likelihood fit, and background model) or 1 (for jet energy scale, multiple interactions, and 
HERWIG-ISAJET differences). We obtain 

m_t = 172.1 ± 5.2 (stat) ± 4.9 (syst) GeV.    (35) 

The effective correlation coefficient between the two measurements is 0.15. If we neglected 
all correlations the result would change by less than 200 MeV. 
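
The propagation of a single systematic source through a weighted average, for a correlation coefficient of 0 or 1, can be sketched as follows. The weights and per-channel uncertainties in the example are hypothetical. The final lines add the statistical and systematic uncertainties of Eq. (35) in quadrature, assuming they are independent.

```python
import math

def propagate(weights, sigmas, rho):
    """Uncertainty of w1*x1 + w2*x2 from one source with correlation rho between channels."""
    w1, w2 = weights
    s1, s2 = sigmas
    variance = (w1 * s1) ** 2 + (w2 * s2) ** 2 + 2.0 * rho * w1 * w2 * s1 * s2
    return math.sqrt(variance)

# HYPOTHETICAL weights and per-channel uncertainties (GeV) for one source.
uncorrelated = propagate((0.5, 0.5), (2.0, 4.0), 0.0)   # e.g. a source with rho = 0
correlated = propagate((0.5, 0.5), (2.0, 4.0), 1.0)     # e.g. a source with rho = 1

# Statistical and systematic uncertainties from Eq. (35), added in quadrature.
total = math.hypot(5.2, 4.9)
print(uncorrelated, correlated, total)
```

The quadrature sum is about 7.1 GeV. A fully correlated source contributes the weighted linear sum of the channel uncertainties, while an uncorrelated source contributes less.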


C. Conclusions 

We have reported the measurement of the top quark mass using six dilepton events. We 
use maximum likelihood fits to the dynamics of the decays to achieve maximum sensitivity 
to the mass of the top quark. We developed two algorithms for the computation of the 
likelihood that exploit complementary features of tt̄ production and decay. Both result in 
very similar measurements of the top quark mass. They also agree well with the mass 
measured from fits to tt̄ → ℓ + jets events, supporting the hypothesis that both channels 
correspond to decays of the same particle. We combine the mass measurements from both 
channels to obtain 

m_t = 172.1 ± 7.1 GeV.    (36) 